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Tlie  simplex  method  for  solving  linear  programs  has  achieved 
its  overwhelming  success  over  the  years  due  to  the  advances  during 
the  last  three  decades  on  extending  the  effectiveness  of  the  basic 
algorithm  introduced  by  Dantzig  in  1951.   However  the  largest 
problem  that  can  be  economically  solved  by  this  method  is  dictated 
by  the  size  of  the  basis.   The  limitation  being  the  ability  to 
maintain  a  sparse  yet  accurate  representation  of  the  basis  inverse. 
Commercial  computer  codes  using  the  recent  developments  of  the 
simplex  method  are  capable  of  solving  mathematical  program  systems 
of  tlie  order  of  10,000  rows.   However,  larger  linear  programs  do 
arise  in  practice.   llie  reason  why  they  are  not  being  set  up  is 
because  none  of  the  production  LP  codes  can  solve  such  large 
problems  in  a  timely  and  cost  effective  fashion.   For  such  large 


problems  iterative  tecliniques  that  do  not  require  a  basis  may  be 
an  attractive  alternative. 

We  review  two  classes  of  iterative  techniques — relaxation 
methods  and  subgradient  optimisation.   Solving  a  linear  program 
can  be  shown  to  be  identical  to  finding  a  point  of  a  polyhedron  P, 
the  polyhedron  being  formed  from  primal  and  dual  constraints  and  an 
additional  constraint  using  strong  duality.   In  relaxation  methods 
we  successively  find  points  of  relaxed  forms  P  c  p'^_  That  is,  at 
iteration  n  we  find  an  x  c  P  such  that  P  <=  P  ,   We  assume  P  ^  0. 
The  problem  is  solved  when  the  sequence  converges  to  a  point 
X*  £  P. 

Subgradient  methods  are  an  alternative  class  of  iterative 
techniques  for  solving  these  problems.   In  subgradient  methods  we 

minimize  a  convex  though  not  necessarily  dif ferentiable  function 

^  /  \  ,    ,   •      •      ,      ri+1    n    n  n    ,      n    „  . 
f(*)  by  the  iterative  scheme  x    =  x  -  t  u  ,  where  t   >  0  is  a 

a  scalar  obtained  by  the  approximate  minimization  of  f(x  -  t  u  ), 

and  u  is  a  subgradient  of  f  at  x  .   VJhen  we  use  the  minimization 

method  for  finding  a  point  of  a  polyhedron  P,  the  function  f  is 

so  defined  that  if  x*  is  a  minimum  of  f  tlien  x'''  G  P. 

These  iterative  techniques  work  with  original  data;  thus 
advantages  of  supersparsity  can  be  fully  realized  and  the  program 
run  in  core.   Also  they  are  self  correcting  and  knowledge  of  a 
vector  close  to  the  solution  can  be  used  to  advantage. 

We  sliow  that  a  generalized  form  of  Merzlyakov's  relaxation 
procedure  subsumes  most  of  the  recent  subgradient  procedures  when 
the  objective  is  to  find  a  point  of  a  polyhedron.   A  new  algorithm 


using  relaxation  method  is  presented.   Tliis  algorithm  v;as  coded  to 
explore  values  for  the  relaxation  parameters  used  in  the  algorithm. 


CHAPTER  1 
THE  PROBLEI-l:   WHY  ITERi\TIVE  TECHNIQUES? 

Mathematical  programming  has  achieved  its  present  popularity 
in  a  large  measure  due  to  the  success  of  the  simplex  method  for 
solving  linear  programs  published  by  Dantzig  [6 J  in  1951.   Over  the 
years  considerable  effort  has  been  expended  towards  exploiting  and 
extending  the  effectiveness  of  this  procedure.   We  will  briefly 
outline  some  of  the  improvements. 

Most  practical  linear  programs  have  a  sparse  matrix  [2]  and  it 
is  necessary  to  exploit  this  sparseness  to  reduce  memory  and  time 
requirements  and  to  maintain  accurate  information.   The  first  such 
procedure  was  the  revised-simplex  procedure.   This  procedure 
operates  only  on  original  problem  data  and  requires  an  inverse  of 
the  current  basis.   If  there  are  m  constraints  and  n  variables 
(including  slacks,  surplus  and  artificials)  then  the  regular 
simplex  procedure  requires 

(1.1)  (m  +  l)(n  +  1) 

storage  locations.   lliere  are  m  +  1  rows  (including  the  objective 
and  n  +  1  columns  (including  the  right-hand  side).   In  the  revised- 
simplex  procedure  we  need 

(1.2)  p(ra  +  l)(n  +1)  +  m^ 

where  p  is  the  problem  density.   (1.1)  is  much  larger  than  (1.2) 
if  p  is  close  to  zero. 


It  was  also  discovered  that  if  the  inverse  matrix  were  stored 
in  product  form  [29]  an  additional  decrease  in  storage  requirements 
could  be  realized.   However,  in  product  form,  the  basis  inverse 
grows  in  size  at  each  iteration  so  that  eventually  we  must  either 
run  out  of  room  or  scrap  the  current  form  of  the  basis  inverse  and 
reinvert  the  current  basis.   In  practice  this  happens  quite  often. 
Using  product  form  inversion,  the  storage  requirements  are 
(1.3)      p(m  +  l)(n  +  1)  +  qmr 

where  q  is  the  average  density  of  the  eta  vectors  needed  in  the 
product  form  inverse  and  r  is  the  average  number  of  eta  vectors. 
Notice,  though,  that  to  gain  the  decrease  in  (1.3)  over  (1.2)  we 
must  spend  time  in  tlie  reinversion  step. 

It  was  soon  realized  that  there  are  many  ways  of  forming  a 
product  form  of  a  basis  matrix.   For  instance,  the  matrix  could 
first  be  decomposed  into  the  product  of  two  triangular  matrices. 
The  eta  vectors  for  these  are  sparse  and  easily  formed.   In 
addition,  within  a  class  of  procedures  for  viewing  the  original 
basis,  one  can  choose  the  row-column  pivot  order  to  minimize  the 
creation  of  new  members  —  i.e.  one  can  choose  the  order  in  forming 
the  eta  vectors  to  minimize  the  density  of  the  resulting  basis 
inverse  representation  [20].   A  related  issue  has  to  do  with 
updating  the  currect  basis  inverse  with  a  new  eta  vector.   One  can 
perform  this  operation  with  a  minimal  buildup  in  storage 
requirements  [14]. 

Kalan  [22]  pointed  out  that  of  the  p(m  +  l)(n  +  1)  elements  of 
a  typical  LP  problem,  most  are  identical  (usually  most  are  +  Is). 


Kalan  suggested  storing  only  the  unique  numbers  of  an  LP  problem 
and  using  pointers  to  store  the  problem  structure.   In  so  doing 
one  stores  fewer  elements  than  exist  in  the  problem.   The  new 
sparseness  is  termed  supersparsity .   Kalan  noted  that  not  only  can 
original  data  be  stored  this  way  but  also  new  data  (formed  in  tlie 
basis  inverse)  can  be  stored  in  tliis  fashion. 

Another  useful  idea  —  scaling  of  data  —  has  numerical 
stability  for  its  motivation.   Generally  for  real  problems  the 
coefficients  of  matrix  may  take  on  a  wide  range  of  values.   ITiis 
can  cause  numerical  difficulties  —  for  example  in  inversion  — 
since  the  computer  must  round  off  fractions.   To  reduce  the  effect 
of  this  problem,  the  matrix  entries  are  scaled  before  starting  the 
problem. 

There  are  a  number  of  ways  in  which  to  obtain  a  starting 
basis:  from  scratch  using  an  "artificial  basis,"  using  a  CRASHed 
basis  forcing  some  of  the  variables  into  the  basis,  or  based  on 
information  about  the  problem.   The  last  approach  is  especially 
useful  in  solving  a  series  of  problems  with  only  minor  variations 
in  data. 

Candidate  selection  can  play  an  important  part  in  the 
efficiency  of  an  LP  algorithm.   There  are  a  number  of  ways  to  select 
the  entering  variable  —  "multiple  pricing"  is  one  of  them  which  is 
especially  useful  for  large  problems.   In  this  approach  we  typically 
divide  the  variables  into  segments  and  pick  tlie  segments  sequentially. 
In  the  chosen  segment  we  select  the  g  most  negative  reduced  cost 
variables  as  candidates  for  pivoting.   We  concentrate  on  these  g 
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non-basic  variables  for  the  remainder  of  some  h  iterations.   And 
then  repeat  the  procedure  with  the  next  segment.   A  more  recent 
development  concerning  pivot  selection  is  due  to  Paula  Harris  [17]. 
The  reduced  costs  here  are  weighted  prior  to  selection.   The 
weights  are  chosen  with  respect  to  a  fixed  set  of  non-basic 
variables.   The  idea  is  that  in  the  traditional  most  negative 
reduced  cost  approach  we  are  trying  to  find  out  the  best  candidate 
but  on  a  local  basis  —  locally  we  move  in  the  direction  of 
steepest  ascent  of  the  objective  function.   Such  a  selection  may 
not  serve  global  interests.   By  constantly  adjusting  the  pricing 
by  weights  derived  from  a  fixed  reference  point,  the  global 
interests  of  the  problem  are  hopefully  maintained.   Experimental 
results  corroborate  this  expectation  and  show  it  to  be  a  very 
effective  procedure. 

When  there  is  degeneracy  in  a  linear  programming  problem, 
there  exists  the  possibility  of  a  set  of  basic  feasible  solutions 
occurring  repeatedly  —  called  "cycling."   Bland  [3]  has  introduced 
several  new  methods  to  avoid  cycling  —  the  simplest  of  which  is 
to  select  as  entering  variable  one  with  the  smallest  subscript 
among  all  candidates  to  enter  basis.   If  there  is  a  tie  for  leaving 
variable,  select  the  one  with  the  smallest  basic  subscript. 

Parallel  with  efforts  to  increase  the  size  of  problems  that 
can  be  tackled  by  the  simplex  method  have  been  efforts  to  increase 
its  capabilities  for  problems  having  special  structure.   llie 
computing  effort  using  the  simplex  method  depends  primarily  on  the 
number  of  rows;  so  it  is  worthwhile  considering  the  possibility  of 


taking  care  of  some  constraints  without  increasing  tlie  size  of  the 
basis.   Generalized  Upper  Bounding  (GUB)  technique  introduced  by 
Dantzig  and  Van  Slyke  [9]  achieves  this  for  contraints  of  the  form 
(l.A)     Zx.  =  b. 

with  the  restriction  that  the  same  variable  does  not  occur  in  more 
than  one  GUB  constraint.   Decomposition  technique  is  another  method 
of  extending  the  capability  of  problem  solving  by  breaking  it  into 
smaller  subprobleiiLS  and  using  a  master  program  iteratively  to  tie 
them  together.   Unfortunately  the  convergence  of  this  method  is 
rather  slow  and  in  addition  one  is  unable  to  compute  bounds  on  the 
value  of  the  objective  function  at  each  iteration,  unless  each  of 
the  subproblems  is  bounded. 

Despite  the  far  reaching  advances  made  during  the  last  three 
decades  in  extending  the  capability  of  the  simplex  method,  some  of 
which  have  been  mentioned  above,  the  basis  inverse  still  remains 
the  limiting  factor  dictating  the  size  of  the  problem  that  could 
be  tackled.   Even  when  we  use  variants  like  GUB  and  decomposition 
techniques,  the  size  of  the  basis  is  still  the  bottleneck.   In  the 
case  of  GUB  it  is  the  basis  of  the  master  problem  consisting  of 
other  than  GUB  constraints  and  with  decomposition  it  is  the  size  of 
the  basis  of  the  largest  of  subproblems.   Thus  this  basis  inverse 
is  the  heart  of  all  that  is  bad  in  simplex  method  related  techniques 
of  solving  an  LP. 

Commercial  computer  codes  using  above  developments  of  simplex 
method  are  capable  of  solving  mathematical  programming  systems  of 
the  order  of  10,000  rows.   However,  useful  linear  programs  of  such 
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magnitude  do  arise  in  practice  for  example  in  food,  oil  and 

chemical  related  industries.   The  reason  why  larger  ones  are  not 

being  set  up  and  solved  is  that  no  production  LP  code  has  been 

developed  to  solve  such  large  problems  in  a  timely  and  cost 

effective  fashion.   Tliis  is  not  to  say  that  there  are  not  large 

linear  programming  production  codes.   For  example,  IBM's  MPSX/370, 

CDC's  APEX,  Honeywell's  LP  6000  X,  Techtronic's 

MPS3,  and  others  are  reliable  and  available  at  reasonable  rates. 

However,  one  would  not  seriously  consider  using  one  of  these  to 

solve  a  linear  program  of  10,000  or  more  rows  on  a  routine  basis 

unless  the  problem  possessed  some  nice  structure  or  the  results  of 

the  problem  solution  had  a  high  economic  value. 

Dantzig  et.  al.  [8]  in  their  justification  for  the  setting  up 

of  a  system  optimization  lab  for  solving  large  scale  models  have 

noted : 

Society  would  benefit  greatly  if  certain 
total  systems  can  be  modeled  and  successfully 
solved.   For  example,  crude  economic  planning 
models  of  many  developing  countries  indicate 
a  potential  growth  rate  of  10  to  15%  per  year. 
To  implement  such  a  growth  (aside  from  political 
differences)  requires  a  carefully  worked  out 
detailed  model  and  the  availability  of  computer 
programs  that  can  solve  the  resulting  large- 
scale  systems.   The  world  is  currently  faced 
with  difficult  problems  related  to  population 
growth,  availability  of  natural  resources, 
ecological  evaluations  and  control,  urban 
redesign,  design  of  large  scale  engineering 
systems  (e.g.  atomic  energy  and  recycling 
systems),  and  the  modeling  of  man's  physiological 
system  for  the  purpose  of  diagnosis  and  treatment. 
These  problems  are  complex,  and  urgent  and  can 
only  be  solved  if  viewed  as  total  systems.   If 
not,  then  only  patchwork  piecemeal  solutions  will 
be  developed  (as  it  has  been  in  the  past)  and 
the  world  will  continue  to  be  plagued  by  one  crisis 
after  another  caused  by  poor  planning  techniques. 


Tlie  bottleneck  in  solving  such  large  unstructured  LPs  is  the 
inability  to  maintain  a  sparse  yet  accurate  representation  of  the 
basis  inverse.   When  the  simplex  method  is  not  computationally 
feasible  for  such  large  problems,  iterative  techniques  that  do  not 
require  a  basis  inverse  may  be  an  attractive  alternative.   In  this 
dissertation  we  concentrate  on  iterative  procedures  for  solving 
linear  programs. 

VJe  discuss  iterative  techniques  under  the  headings  (a) 
relaxation  methods  and  (b)  subgradient  methods.   Tliese  methods  are 
insensitive  to  the  number  of  constraints.   In  fact  an  increase  in 
the  number  of  constraints  for  the  same  problem  improves  the 
convergence  of  the  relaxation  methods.   Unlike  the  simplex  method, 
iterative  techniques  are  self-correcting.   We  always  use  original 
data  thus  advantages  of  supersparsity  [22]  can  be  fully  realized 
and  the  program  run  in  main  memory.   Finally  knowledge  of  a  vector 
close  to  the  solution  can  be  used  to  advantage.   This  is  very 
useful  when  a  sequence  of  problems  has  to  be  solved  with  only 
slightly  varying  data. 

As  an  aside  to  the  main  thrust  of  this  dissertation,  we  note 
that  iterative  methods  are  more  attractive  for  solving  certain 
large  Markov  decision  processes  and  also  for  problems  having  a 
Leontief  Substitution  constraint  set  [23,  25]-both  of  these  problems 
are  an  important  special  class  of  LPs. 


CHAPTER  2 
BACKGROUND  RESULTS  AND  TOOLS 

Introduction 
Motivation 

In  this  chapter  we  will  review  various  iterative  techniques 
for  finding  a  point  of  a  polyhedron.   We  are  interested  in  finding 
a  point  of  a  polyhedron  because  a  wide  variety  of  important 
mathematical  programs  can  be  reduced  to  such  a  problem.   For 
example,  solving  a  linear  program  could  be  reduced  to  finding  a 
point  of  a  polyhedron  using  primal  and  dual  constraints  along  with 
an  additional  constraint  implying  strong  duality.   To  elucidate  this 
further,  consider  the  following  primal-dual  linear  programming  pair: 

(2.1)  Primal  Dual 
Max  c' X              Min  v'b 

s.t.   Ax^b         s.t.   v'A^c 
X  >0  V  >  0 

Here  A  is  m  x  n;  c  is  n  x  1  and  b  is  m  x  1 .   v'  stands  for  transpose 
of  V.   Problem  2.1  can  be  restated  as 

(2.2)  -Ax  +  b  ^  0 
v'A  -  c  ^  0 

v'b  -  c'x  ^  0 
V  ^  0,  X  ^  0 


Let  1'  =  {(^)-     (^)  satisfies  2.2}  be  the  set  of  solutions  to  2.2. 

(^)C  P  implies  that  x  solves  the  Primal  problem  in  2.1  and  v  solves 

the  dual  problem. 

Linear  programs  of  the  form 

z*  =  Max  Min  (alx  +  b.) 

X   i=l,  .  . . ,ra 

relate  directly  to  the  problem  of  finding  a  point  of  a  polyhedron  P 

where 

P=  (xe  r'^:  alx  +  b.  k   z*,    i  =  1,  ...,  m} 

Such  problems  arise,  for  example,  in  large-scale  transportation 

problems  and  decomposition  problems  [2  7].   We  will  develop  such  a 

formulation  for  the  transportation  problem. 

Given  k  origins  with  stocks  a,  >  0  of  a  single  commodity,  and 

m  destinations  with  demands  d.  >  0  the  transportation  problem  involves 

J 

determining  a  shipping  schedule  that  meets  all  demand  and  supply 
contraints  at  minimum  cost.   Let 

z„.  be  the  amount  shipped  from  £  to  j 

c    cost  of  shipping  one  unit  from  £  to  j . 
£j 

The  problem  is 

k   m 

Min  Z       T.    Cn  .    Zn  ■ 
i=l   j=l  ^J   ^J 


subject  to 


'^   Zo  .  =  ^o  »  ^  =  1.  •  •  •  ,  !<■ 
j=l  ^^  ^ 


k 

^   2p  .  =  d.  ,  j  =  1,  . . . ,  m 
£=1   ^J     J 
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^.Ij'" 


Without  loss  of  generality,  we  assume 


k        m 
Z   3.  =   Id 
£=1   ^   j=l 


If  m  and  k  are  too  large  to  be  handled  by  conventional  network 
methods  [4],  then  the  decomposition  principle  can  be  used.   Let 

{(z^.:  i  =   1,    ...,    n;  j  =  1,  ...,  m)  :  i  G  I'} 
be  a  list  of  extreme  points  of 

k 

^   z„=d.,j=l,  ...,m 
£=1   ^J     ^ 

Then    the    transportation   problem  is   equivalent    to    the   master    problem 

km  .  . 

Min      Z        L      c  Z      Zp  .    A 

£=1   j=l      ^J    id     ^J 

subject    to 

ra 
a,    -     Z        Z      Zp.    A      =   0,    £  =   1,    ...,    k 
ie  I  j  =  l        -' 


Let 


Z      A^   =   1 
iel 

A^   ^   0. 


k        m 
b .    =      Z        Z      c„  .    z.  . 


m 

— 1    _      1  y  1 

j  =  l 


The  master  problem  becomes 

Z 
iel 


Min  Z  b^A^ 
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subject  to 


-E  a.   X      =0,  J2.  =  l,  ...,k. 
iel 


Z  X  =  1 
i£l 


A^  >  0 

The  dual  to  the  above  problem  is 

Max  w 

subject  to 

k 
-L      aj  x„  +  w  g  b^ 
£=1   ^   ^ 


w  <  b   +  a  !  X 

1 


where 


a^  =  (a  ,  a    . .  . ,  a  )  G  R 

This  problem  is  equivalent  to 

Max^  Min  r  ,   ,  ,  -i 
^Jk  ...  ta.x  +  b  .  } 
xeK  lel    1     1 

Task 

Two  of  the  methods  of  finding  a  point  of  a  polyhedron  are  the 

Fourier-Mo tzkin  Elimination  Method  [7]  and  a  Phase  I  simplex  method. 

The  Elimination  Method  reduces  the  number  of  variables  by  one  in 

each  cycle  but  may  greatly  increase  the  number  of  inequalities  in 

the  remaining  variables.   For  larger  problems  tliis  procedure  is 

impractical  [26]  —  since  the  number  of  inequalities  grows  at  a  rapid 

rate.   The  Phase  I  simplex  procedure  can  also  be  used  to  find  a 

point  of  a  polyliedron.   For  very  large  problems  the  simplex  method 

breaks  down  due  to  a  practical  limitation  on  the  size  of  the  matrix 
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that  can  be  inverted  as  explained  in  the  last  chapter.   It  is  for 
problems  of  this  nature  that  iterative  methods  may  be  an  attractive 
alternative.   We  discuss  iterative  techniques  under  two  classes  — 
relaxation  methods  and  minimization  methods. 

In  relaxation  methods,  at  each  iteration  we  attempt  to  find  a 
point  of  a  relaxed  form  P   of  P.   That  is  at  iteration  n  we  find  an 
X  e  P   such  that  P  c  P  .   We  look  for  the  property  that  the  sequence 
of  points  thus  generated  is  pointwise  closer  to  P.   Minimization 
methods  attempt  to  solve  the  problem  of  minimizing  a  general,  maybe 
non-dif f erentiable,  convex  function  f(*).   The  sequence  generated 
generally  has  the  property  that  the  successive  iterates  are  closer 
to  the  level  set  of  solutions  in  distance.   X"  is  one  such  point. 
The  value  of  the  objective  function  need  not  necessarily  improve  at 
each  iteration.   When  we  use  the  minimization  method  for  finding  a 
point  of  P,  the  function  f  is  so  defined  that  if  x*  is  a  minimum  of 
f  then  X*  e  P. 

We  will  now  review  the  relaxation  methods.   As  was  mentioned 

1  ■        r-  ■  ,        ,   .      .  .     n    „n   ,     ^n  . 

earlier,  we  find  at  each  iteration  n,  a  point  x   e  P  where  P   is  a 

relaxed  form  of  P.   That  is  P  c  p  .   We  continue  finding  points 

successively  in  this  manner  till  we  can  find  an  x*  e  P.   At  least 

one  such  point  exists  by  our  assumption  that  P  7^  0. 

Definitions 

We    need   a    lew   definitions   before   some   of    tlie   methods    can   be 

described.       Let 

Ji.  (x)    =  alx  +  b. 
1  11 
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k        k 
where  i  £  I,  1^0  and  finite,  x  e  R  ,  a.  e  R  and  b.  e  R.   Without 

loss  of  generality  assume  ||  a.  ||  =  1  for  all  i,  where  ||  •  ||  is  the 

Euclidean  norm.   H.  given  by 

H.  =  {x  e  R^:    I. (x)  >   0} 
1  1 

is  the  halfspace  associated  with  i  C  I.   P  given  by 

P  =   n   11.  =  {x  e  R^:  5,.(x)  ^  0,  j  £  1}  is  the  polyhedron 
iel  "^  ■*■ 

of  interest.   E.  given  by 

E.  =  {x  C  R*^:  £.  (x)  =  0} 
1  1 

is  the  plane  associated  with  halfspace  II.  .  d(x,ll.)  given  by 
d(x,  H.)  =  inf  II  x-z  || 

Z  £  H. 

1 

is  the  Euclidean  distance  from  x  £  R   to  halfspace  H. .   d  (x)  given 
by 

d  (x)  =  max  d (x,  H . ) 

P  J- 

iel 

is  the  maximal  distance  from  x  to  P.   B  (x)  given  by 

B^(x)  {y£R^:||  x-y||  ^  r} 

is  the  ball  with  radius  r  and  center  x.   S  (x)  given  by 
r 

S^(x)  =  {yER^'iJI  x-y||  =  r} 
is  the  k-1  dimensional  sphere  of  radius  r  and  center  x 

Typical  Relaxation  Method 
Procedure 


A  typical  relaxation  method  for  finding  a  point  of  P  is 

o    k 
(1)  Pick  X  £  R  arbitrarily 


u 


(2)    If   X     e    P  stop,    otherwise    let    i'''  be    the 

most  isolated  halfspace,  i.e., 

-a!  A-  X  -  b . *  >  -a!  X  -  b.,  for  all  iel. 
1         1=1       1 

We  obtain  the  sequence  {x  }  recursively  as  follows: 

n+1    n  ,  ,  n  .  n    n, 
X    =x+A   (t-x) 

where  t  ~   ^-n      (x  )  is  the  projection  of  x  on  E.^^.   A   is  called 

the  relaxation  parameter  at  iteration  n.   Generally  A   is  constant 

across  n  =  1,  2,  ... 

Variations 

If   A      =1    for   all   n,  we   call    the   procedure   a   projection   method, 
and  when  A      =    2    for   all   n,  we   call    it   a   reflection  method.      See 
Figure    1   for   an   example   of    the   relaxation   procedure.       In    this 
example  we    use   A      =   A   =   2,    for   all   n.      At    x°,    E      is    the   plane 

corresponding   to    the   most   violated    halfspace.      We    reflect   in    C 

11  2 

to    get  X    .      At   x     we   reflect    in   E      to   get    x    .       If  we   continue 

the   procedure   we   are   in   P   in    four    iterations. 

Convergence 
Fejer-monotone   sequence 

We   are    ultimately   interested   in   finding  a   point   of   P   or   at    least 
finding  a   sequence   of   points  which   converge    to   a    point   of   P.       If    the 
sequence    terminates  with   a   point   of   P    (as    in   Figure   1)    we   have 
accomplished   our   goal.      If    the   sequence   does   not    terminate,    then  we 
wish    to   know  whether    it    converges    to  a   point   of   P.      A  well   known 
result   concerning   "Feje'r-monotone"   sequences   will  be    used    to   shed 
light    on    this    question. 
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a^x  +  b      >   0 


/ L.    E, 


FIGURE   1 
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n  k 

A  sequence   of    points    {x   }    in   R     is   said    to  be   Fejer-raonotone 

[13,  31]  with  respect  to  the  set  P  if 

. X    n+1  ,   n 
(i)   X     f   y.      and 

(ii)   II  X    -  zjl  ^  II  X  -  z||  for  all  z  £  P  and  for  all  n. 

Motzkin  and  Schoenberg 

A  well  known  result  in  analysis  is  that  if  (s  }  is  monotone, 
then  {s  }  converges  if  and  only  if  it  is  bounded.  Hence  we  see  that 
a  Fejer-monotone  sequence  always  converges  in  distance.  However,  we 
need  to  relate  this  convergence  to  the  convergence  of  sequence  {x  } . 
Before  we  relate  a  theorem  on  convergence  of  Fejer-monotone  sequence 
of  points,  we  need  a  few  more  definitions. 

Given  a  set  K  c  R  ,  the  af f ine  hull  of  K,  denoted  by  jU(K)  ,  is 

M(K)  =  {x  e  R*^:  X  =  Z  A^x^,  x^  e  K, 

ieL 

^  A   =  1,  L  finite,  A"^  c  r}  . 
ieL 

For  example,  the  affine  hull  of  a  point  is  the  point  itself  and  affine 

hull  of  a  line  segment  is  a  line  containing  the  segment.   The  affine 

3 

hull  of  a  triangle  in  R  is  the  plane  containing  the  triangle.   A 

set  P  is  affine  if  P  =  M(P) . 

k  + 

Let  M  be  affine  in  R  ,  x  a  point  of  M  and  r  e  R  be  a  positive 

real  number.   Then  we  define  S  (M, x)  as  the  spherical  surface  with 

axis  M,  center  x  and  radius  r  by 

(2.3)     S^(M,x)  =  x  +  ((M  -  x>L  n  {z  c  R*":  ||  z||  =  r}) 

Figure  2  shows  the  construction  of  a  spherical  surface.   Given  a 

point  of  affine  set  M,  M  -  x  is  the  translation  of  M  by  x.   (M  -  x)-*- 

is  orthogonal  to  M  -  x.   S  (0)  is  a  spherical  surface  with  radius  r, 

r  ' 
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which    intersects    (M  -   x)-^  at    Wo   points.      Tliese    two   points    are 
translated  back  by   x    to   give   S    (M,    x) . 
(2. A)      nieorera    (Motzkin  and    Schoenberg) 

Let    the    infinite   sequence    {x    }   be   fejer-monotone  with    respect 
to    the   set   P,    then 

Case    (a) :      If   P  has  an   interior,    the  sequence   converges    to   a 
point. 


(M   -    x)-L 


(M   -   x) 


FIGURE   2 


Case  (b):   If  P  does  not  have  an  interior,  then  the  sequence 
can  be  convergent  to  a  point  or  the  set  of  its  limit  points  niay  lie 
on  a  spherical  surface  with  axixA[(P). 

See  Figure  3  for  an  illustration.   In  case  (a)  P  has  an  interior 

0        12        3  * 

and    the   sequence   x    ,    x    ,    x    ,    x    ,    ...    converges    to   a    point   x     of    P. 

In   case    (b)    P   does   not   have   an    interior    and    the    Fejer-mono tone 

0123  -,.u  .  .     .  ■  p      u 

sequence   x    ,    x    ,    x   ,    x    ,    ...    results    in    the    two    limit   points   or    the 

sequence  lying  on  a  spherical  surface  with  axis  M(P)  and  center  y. 

y  being    the   projection   of   x     on   P 

Relaxation  Procedures 
Agmon 

We  are  now  in  a  position  to  start  our  survey  of  relaxation 

procedures.   One  of  the  earliest  published  methods  is  the  projection 

0    k 
method  due  to  Agmon  [Ij.   Here  we  specify  an  initial  vector  x  e  R 

(chosen  arbitrarily)  and  set  A  =  1  for  all  n.   At  iteration  n  if 

X   e  P  stop.   Otherwise  we  find  a  nev;  point  x    as  follows: 

n+1    n  ,  ,  n  ,  n    n. 
X    =x+A   (t-x) 

(a.   X  +  b .  )  a. 
n     ^n       In   ^n 
=  X  -  


a.   a. 


=x  -(a.   X  +b.v 

In       In)  a,- 
In 

since     ||a.    11   =    1.      Here   i      represents   a   most   violated    contraint    of 

II     1    M  n 

n 
P  at   X   . 
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Case    (a) 


S^(M(P),    y) 


13     5 

X     ,  X     ,  X     , 


M(P) 


0      2      4      6 

X  ,X  ,X  ,X         y 


Case    (b) 


FIGURE   3 
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This  procedure  yields  a  sequence  which  converges  to  a  point  of 
P.   In  the  Agmon  procedure 

P"  =  {x:  a'.  X  +  b.   ^  0} 


That  is,  P   is  the  closed  halfspace  defined  by  the  constraint  i^^  and 

P  c  ?  .   In  keeping  with  our  notation,  here  p"  is  a  relaxed  form  of  P. 

(2.-5)   -Theorem  (Agmon) 

Let  P  =  n   H.,P/(t),  Ij^i})  and  finite,  be  the  solution  set  of 
iel  ^ 

a  consistent  system  of  linear  inequalities,  and  let  {x"}  be  the 

sequence  defined  by  the  projection  method,  that  is,  A"  =  1  for  all  n. 

Tlien  either  the  process  terminates  after  a  finite  number  of  steps  or 

the  sequence  (x  }  converges  to  a  point  of  P,  x  , 


Furthermore, 


n     "" 

X   -  X 


g  2d(x°,  p)  e" 


where  0  c  (0,  1)  and  depends  only  on  the  matrix  A,  where 


A   = 


^   1 
I  ra  / 


and       m  =  1 1 1  . 

Agmon  explicitly  proved  the  convergence  for  tlie  projection 
method.   He  also  showed  that  the  results  could  be  extended  to  the 
case  A  c  (0,  2),  where  A"  =  A  for  all  n. 
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Motzkin  and  Schoenbera,  Eaves 

Motzkin  and  Schoenberg  131]  (among  others)  described  a  reflexion 
method  for  finding  a  point  of  P.   They  showed  that  if  P  has  an 
interior  then  the  sequence  terminates  with  a  point  in  P  after  a 
finite  number  of  steps.   Let  x  ^  P  be  an  arbitrary  starting  point 
and  generate  a  sequence  as  follows: 

If  x"  e  P  stop 

If  X  ?!  P  select  a  half  space  such  that  x   ^  H. 

Let  X    be  obtained  by  reflecting  x   in  E . .   After 
finitely  many  iterations,  x  will  fall  in  P  if  P  has  an 
interior. 

The  general  re  flee  ion  function  can  be  expressed  as  follows; 

k     k 
For  i  =  1,  ...  ,  m  define  f.:  K  — *  R  by 

1 

(X  -  2(a.x  +  b.)a.  if  x  ?!  H. 
Ill         1 
X  if  x  C  H. 
Let  g  ,  g„,  ...  be  a  sequence  where 


g.  e  {f  ,  ...,  f  }  for  j  =  1,  2,  . 

J      1         m 


Define  g   to  be  the  composite  function 

g"(x)  =  g  (g   ,   (...(g,(x))  ...)) 
n  n-i        1 

(2.6)   Theorem  (Motzkin  and  Schoenberg) 

Let  P  ^  0  and  I  t^  0  and  finite.   If  P  =   n   H.  has  an  interior, 

iEl   ^ 

then  for  any  x  ,  there  is  an  H   such  that 
g  (x)  c  P. 
Eaves  111]  extended  this  result  and  demonstrated  that  the 
reflection  procedure  has  a  uniform  property.   Namely,  if  x  is  within 
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I,   0, 


a 


distance  6  of  P,  then  g  (x  )  will  fall  in  P  for  some  I   where 


*      *  -  0 

%  ^   %     and  t     depends  on  o  and  not  on  x  . 

(2.8)   Theorem  (Eaves) 

Assiime    that  P   =      n     H      has    an    interior.      Let    X  be    the    set    of 
id      " 

points  within  a   distance    6  of   P,    that    is, 

X  =    {x  c   R*^:    d(x,    P)    ^    6} 

%  1 

Tlien  there  is  an  £  such  that  g  (X)  c  P .   g   is  thus  a  piecewise, 

linear  retraction  from  X  to  P . 

Piecewise  and  linear  means  that  it  has  finitely  many  linear 

pieces.   Retraction  means  g  :  X  -^-  P  is  continuous,  P  c  X  and 

g  (x)  =  X  for  X  £  P.   See  Figure  4  for  example.   Six  points 

chosen  arbitrarily  within  1"  of  P  all  converge  in  under  four 

iterations . 


Gof f in 

Definitions.   Coffin's  [15]  work  deserves  special  mention. 
He  has  provided  a  comprehensive  treatment  of  relaxation  methods 
and  presented  several  useful  finiteness  properties.   We  need  a 
few  more  definitions  before  presenting  Coffin's  results. 

A  convex  subset  F  of  P  is  a  face  of  P  if 

X.  y  e  P  'I 

l>  — ^  X,  ye  F. 
(x,  y)    \<    i   ^   ] 

We  denote  the  set  of  faces  of  P  by  F(P).  Figure  5  illustrates 
F(P)  for  a  given  P.  Zero-dimensional  faces  are  1,  2,  3.  One- 
dimensional  faces  are  4,  5,  6.   The  polytope  7  itself  is  also  a  face. 


23 


FIGURE  4 
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Tlie  zero-dimensional  faces  of  P  are  called  the  extreme  points  of  P. 
Tlie  n-1  dimensional  faces  are  called  facets  of  P. 


FIGURE  5 


N  (x)  defined  by 

Ng(x)  =  {u  £  R^:  u'(z  -  x)  <  0,  Vz  e  S} 

is  the  normal  cone  to  S  at  x.   N  (x)  is  a  non-empty,  closed  convex 

o 

cone.   It  is  non-empty  because  it  contains  at  least  the  origin.   It 
is  closed  and  convex  because  the  intersection  of  closed  halpspace  is 
closed  and  convex.   It  is  a  cone  because  for  all  u  c  N  (x) , 
Au  £  N  (x) ,  A  ^  0. 

Cg(x)  =  (Ng(x))^ 
is  the  supporting  cone  to  S  at  x.   Figure  6  illustrates  these  last 
two  definitions. 

A  point  X  of  a  set  P  is  a  relative  interior  point  designated 
r.i.(P.  )  if  it  is  an  interior  point  of  P  with  respect  to  the  relative 
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topology  induced  in  M(P)  by  the  Euclidean  topology,  i.e.,  xe  r.i.(P) 

if  there  exists  6  >  0  such  that  Br(x)  n  M(P)  c  p.   For  example,  as 

2  2 

shown  in  Figure  7,  a  line  segment  in  R  has  no  interior  in  R  but 

has  a  relative  interior. 


FIGURE  6 
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FIGURE  7 

It  can  be  shown  that  the  normal  cone  is  the  same  for  all 
relatively  interior  points  of  a  given  face,  i.e., 
Np(x)  =  Np(F)  for  any  X  c  r.i.(F) 


where  F  is  a  face  of  P.   Similarly 


Cp(x)  =  Cp(F)  for  any  x  E  r.i.(F). 

Let  T  be  any  set,  then  -T  is  defined  as 

-T  =  {x  e  R  :  X  =  -y,  y  £  T} 

A  closed  convex  set  is  said  to  be  smooth  enough  at  a  point  v  of 
its  boundary  if 

-Np(y)  c  Cp(y) 
A  closed  convex  set  is  said  to  be  smootlLenou^,_  if  it  is  smooth 
enough  at  every  boundary  point,  or  equivalently  if 

-Np(F)  c  Cp(F)       ¥-F£  F(P)  -  0 
where  F(p)  stands  for  the  set  of  faces  of  P  and  F(P)  -   cl>   is    the  set 
F(P)  with  0  removed. 
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Tlie  smooth  enough  property  applied  to  a  polyliedron  would  require 
all  its  "corners"  to  be  "smooth  enough,"  carrying  the  analogy  to 
k-dimensional  Euclidean  space. 

Some  characteristics  of  smooth  enough  convex  sets  are  mentioned 
below: 

A  polyhedral  cone  C  is  smooth  enough  if  and 

P  P 

only  if  (iff)  -C  "^   C,  where  C   is  the  polar  of  C. 

c''={u  e  R*^:  u'z  ^0,  Vz  e  C}. 

A  polytope  (bounded  polyhedron)  is  smooth 

enough  iff  all  its  extreme  points  are  smooth 

eno  ugh . 

A  polyhedron  is  smooth  enough  iff  -N  (F)  <^  C  (F) 

for  ail  the  minimal  elements  of  the  poset  {F(P)  -  (}>,  c} 

where  *=  stands  for  set  inclusion. 

Some  illustrations  are  given  in  Figure  8. 

A  poset  (P,  <^)  is  a  set  P  on  which  a  binary  relation  £  is 

defined  such  that  it  satisfies  the  following  relations; 

(i)   for  all  X,  X  _<  X  (reflexivity) 

(ii)   if  X  j<  y  and  y  £  z  then  x  £  z  (transitivity) 

(iii)   if  X  _<  y  and  y   <_  x   then  x  =  y  (antisymmetry) 

Examples  of  posets  are  {R,  <] ,    i.e.,  the  reals  with  the  ordinary 

less  than  equal  to  operation,  a  dictionary  with  lexicographic 

ordering,  and  the  power  set  S,  P(S),  with  set  inclusion,  r  ,  as  the 

binary  operation. 

Given  a  unit  vector  e  e  R  and  an  angle  a  £  [0,11],  the  set  of 

vectors  which  make  an  angle  with  e  less  than  or  equal  to  a  is  called 

the  spherical  cone,  C  (e),  with  axis  e  and  half  aperture  angle  a. 
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(a)   Not  smooth  enough  at  the  extreme  points  of  P. 


(b)   Not  smooth  enough  at  any  point  of  P. 


(c)   Smooth  enough. 


FIGURE  8 
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C^(e)  =  {x  £  R^:    x'e  = 
See  Figure  9  for  an  illustration 


X  II  Cos  a} 


C^(e)  =  c{B^(e)} 

a   =  sin  a,  a  £  [0,  jr) 

2 


FIGURE  9 


Some  properties  of  C  (e)  are  given  below: 


a 


C  (e)  is  closed  for  a  e    [0,11] 


C^(e)  is  a  convex  set  for  a  e  [0,  11/2] 
For  a  £  [0,11/2)  if  we  let  v  =  Sin  a,  then 
C^(e)  =  c{B^(e)}  where 

C(S)  =  {x  £  r'':  X  =   Z  X\\    x^  £  S,  A^  >  0,  L  finite} 

i£L 

C(S)  is  the  convex  cone  hull  of  S. 
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Following  Goffin  [15]  we  define  an  obtuseness  index  u(C)  of  a  closed 

convex  cone  C  to  be  the  sine  of  the  half  aperture  angle  of  the 

largest  spherical  cone  contained  in  C.   Properties  of  V(C)  are: 

U(C)  =  sup  {Sin  a;  C  (e)  c  C,  e  e  B  (0)} 

=  sup  min(a'. e) 
eeS^(O)  id  ^ 


U(C)  >  0  iff  C  has  an  interior 


u(C)  =  1  iff  C  is  a  halfsp 


ace . 


If  C  (e)  is  a  spherical  cone  then  U(C  (e))  =  Sin  a, 


a 


a 


The  obtuseness  index  of  a  polyhedron  P  is  defined  by 

U(P)  =  inf  U(Cp(x))  =  min  U(Cp(F)) 
x£;3P  F£F(P)-i}) 

For  a  polytope  we  have 

U(P)  =  min  U(Cp(F)) 
FeVertices  of  P 

If  U(P)  =  1/42,  then  P  is  smooth  enough.   However,  tliis  is  not  a 

necessary  condition  for  P  to  be  smooth  enough.   If  P  is  the  positive 

3 
orthant  in  R  ,  then  P  is  smooth  enough  but 


i(P)  =  I//3  <  I/J2. 


The  distance  from  a  point  x  to  a  set  S  <=  r*^  is  defined  by 

d(x,  S)  =  inf  II  X  -  z|| 
zES 

If  S  is  closed,  then  the  infimum  is  attained,  and  the  set  of  points 

of  S  where  it  is  attained  is  the  set  of  points  closest  to  x  in  S 

Tg(x)  =  {y  e  S:  II  X  -  y||=  d(x,  S)} 

If  a  set  K  is  closed  and  convex,  then  T  (x)  reduces  to  a  single  point 
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which  will   also   bo    denoted   by   T    (x)  .      Tliis    point    is   cnllcd    the 

K 

projection  of  x  into  K.   Hence  T  (x)  is  a  retraction  from  R   to  K. 

K 

Let  K  be  a  closed  convex  set  in  R  .   Tlie  following  important 
result  implies  the  continuity  of  the  map  T^ 
(2.8)   Theorem  (Cheney,  Goldstein) 


K 


The  projection  map  T  satisfies  the  Lipschitz  condition  (for 

K. 


C  =    1).      That  i 


IS 


II  Tj,(x)    -    Tj^(y)||^    ||x  -   yl 
See    Figure   10    for   an   illustration; 


\\M  -  \(y)\\   <   ||x  -  yjl 


FIGURE   10 
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(2.9)-    Tlieoreni    (Goffin) 

Let  K  be  a   closed    convex  set   of    R     and    T    (x)    the   unique    closest 

K 

point  to  X  in  K.   Then 

{[Tj^(x)  +  Nj^(T^(x))J,  Tj^(x)  e  K}  is  a  partition  of  R^. 
See  Figure  11  for  an  illustration. 


FIGURE  11 


(.2.10)   Le 


mma 


Let  C  c  R  be  a  closed  convex  cone.   Then 

P       P 
T^(-C  )  c  -  c"^ 

P 
where  C   is  the  polar  of  C. 

An  alternative  formulation  of  a  spherical  surface  with  axia  M, 

radius  d(x,  M)  and  center  T^XX)  (see  2.3)  is  contained  in  the 

M 


following  le 


mma. 
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(2.11)  Lemma  (Coffin) 

k  k 

Let  M  be  affine  in  R  and  x  c  R  ,  then 

Sd(^^j^)(M.Tj^jM)  =  {y  e  r'':  II  X  -  z\\  =  ||y-,||,  v-z  e  m} 

Finally  a  well  known  result  on  norm-equivalence  is  stated 
below.   Here  we  state  the  result  in  terms  of  distances  from  x  to  P 
witli  X  i   ?. 

(2.12)  Lemma 

Let   P  =      n   II .  ,    P  /   0,    I   j^   0,    then   there   exists   a   scalar 
i£l    ^ 

0   <   y   ^   1   such    that   for  all   x  c    R 

yd(x,    P)    ^   dp(x)   ^   d(x,    P). 

This   result   is    illustrated   in   Figure   12. 

Another   Proof   of   Agmon's    Result.      ITie    following   is    a   different 

proof   for    the   result   originally   due    to   Agmon.      We  will    find    this 

useful   in  subsequent  parts   of    this   material.      In    this   proof  we 

make   use   of    the  result    that    the  sequence   generated  by    the   relaxation 

method   is   Fejer-monotone  with   respect    to   P. 


FIGURE   12 
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(2.13)   Tlieorem 

Let  P=  n   H.,P?^0,  I?^0  and  finite,  then  the  relaxation 
id   ^ 

method  generates  a  sequence  which  converges  finitely  or  infinitely 

to  a  point  x*  £  P  for  any  given  value  of  the  relaxation  parameter  A 

such  that  X  £  (0,  2)  . 

Furthermore,  we  have 

II  x"  -  x^'ll   s  2d(x",  p)  e" 

where  0   =    [1   -   A(2  -A)u^]^^^ 

A  e    [0,1) 

Proof: 

Assume   x     ?!   P 

First  we  will  show  using  Figure  13  that 

d^x'''^\  P)  =  d^x",  P)  -  A(2  -A)dp(x") 

where  d  (x  )  is  the  maximal  distance  from  x   to  P  i.e.,  d  (x  )  = 

max  d(x,  H. ) . 
iel       ^ 

T,     ,    .  ,     .    -,    n+1    n   „  ,  n+1^ 

From  the  right  triangle  x    ,  t  ,  T  (x   )  we  have: 

,2,  n+1   „.    j2.  n+1   n,  ,  ^2.  n   ^.  ._. 

d  (x    ,  P)  =  d  (x    ,  t  )  +  d  (t  ,  P)  (1) 

From  the  right  triangle  x  ,  t  ,  T  (x  )  we  have: 

,2,  n+1       ,2,n   n.,,2,n„,  ,„, 

d  (x   ,  P)  =  d  (x  ,  t  )  +  d  (t  ,  P)  (2) 


From  (1)  and  (2) 


,2,  n+1       j2.n  „,    .2,  n+1   n,    ,2,  n   n. 

d  (x    ,  P)  -  d  (x  ,  P)  =  d  (x    ,  t  )  -  d  (x  ,  t  ) 

,2   n+1        2.  n  n+1    n,  ,  ,  .  n   n.  ^ 

d  (x    ,  P)  =  d  (x  ,  P)  -  [d(x    ,  t  )  +  d(x  ,  t  )] 

r  ,  /  n  n .    ,  .  n+1    n .  , 

[d(x  ,  t  )  -  d(x    ,  t  )] 

=  d^x",  P)  -  Adp(x'')[(2-A)dj,(x")] 

=  d^(x",  P)  -  A(2  -  A)dp(x") 
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It  will  be  noticed  that  our  proof  is  independent  of  Figure  13.   The 
figure  only  helps  to  illustrate  the  ideas. 


n+I 


^^ ^ ::^ ^^ \  \ 


1  (x  )  =  1  (x   ) 
P        P 


FIGURE  13 


By  lemma  2.12  there  exists  a  y  >  0  such  that 
yd(x,  P)  ^  d^,(x)  ^  d(x,  P) 


hence 

d^x"^\    P)    <  d^x",    P)   -y^A(2  ^   A)d^x",    P) 
=   d^(x",    P)[l  -  A(2  -   A)   y^] 
(2.14)      Let     0  =    [1  -  A(2  -   A)y^]^^^ 
0=0  when   A  =   1,    y   =   1.      In   general  8   e    [0,    1).      Thus, 

d^x^+\  p)  ^  e^d^x",  p) 


or 


d(x",  P)  ^  e^'dCx^,  P) 


Case  I:   If  the  sequence  does  not  terminate 

lim  d(x",  P)  =  0 
n— K» 

Since    {x   }    is    Fejer-monotone   and   dim  P   =   n,    then    {x    }    converges    to 
X     e    8P  by   Theorem  2.4  where    3P   is    the  boundary   of   P. 
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Case    II:      If    the   sequence    terminates    in  a    finite   number   of   steps 

II  X  -    z||     <    II  x"^  -    z||,     ^  z  9    P   since    the   sequence    is 

Fejer- mono  tone  with    respect    to   P.      Clearly    ||  x*  -  T    (x")||^  ||  x"  -  T  (x)||, 
hence 

and   x^    x"  both  belong   to   the  ball   B^^^n^    p)Tp(x")    (see   Figure   14). 
Hence, 

II  x"  -   x^-ll     <   2d(x",    P) 

<  2e"d(x°,  p) 

If  the  sequence  terminates,  we  can  take  x*  to  be  the  last  point  of 
the  sequence. 


/// 


n+2 


X 


FIGURE  14 
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In  the  above  procedure,  if  1  i  A  <  2  then 
P"  =  (x:  a'.   X  +  b,.   >  0} 


That  is,  P  is  the  closed  halfspace  defined  by  the  constaint  i 
If  0  <  A  <  1,  then 


P"  =  (x:  a!   X  +  (1  -  A)b.   -  Aa !   x"  >  0} 
\  11- 

n  n      n 


1  11 

n 

.n    ,   ,   ,        „n 


and  again  P  <=  p  .   in  both  cases  P   is  a  relaxed  form  of  P  and  at 

r .     ,   n   „n 
Iteration  n  we  find  x  e  P  . 

Range  of  A  for  finite  convergence.   Using  relation  2.14  giving 

9  against  different  values  of  A  and  \i,    we  get  the  results  in  Table  1 

shown  graphically  in  Figure  15. 

TABLE  1 


p  =  1 

A 

.  1 

.5 

.75 

1 

1.5 

1.9 

e^=f(A) 

.9 

.5 

.24 

0 

.5 

.9 

y  =  '-2 

A 

.1 

.5 

.75 

1 

1.5 

1.9 

e,,=f(A) 

-2 

.98 

.9 

.87 

.87 

.9 

.98 

u  =  -'. 

A 

.1 

.5 

.75 

1 

1.5 

1.9 

e,^=f(A) 

■4 

.99 

.98 

.97 

.97 

.98 

.99 

From  Figure  15  it  would  appear  that  A  ^  1  gives  best  results  (lowest 
G).   However,  Motzkin  and  Schoenberg  dcinonstra ted  that  when  P  has  an 
interior  A  =  2  leads  to  finite  convergence.   Also  if  P  has  acute 
angles,  \i   will  be  low  and  9  high.   Tlius  an  obtuse  angle  should  lead 
to  faster  convergence.   Goffin  determined  the  range  of  values  of  A 
which  gives  finite  convergence. 


38 


FIGURE   15 
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(2.15)   Theorem  (Coffin) 

Let  P  be  a  polyhedron  with  an  interior.   Then  the  sequence 
generated  by  the  relaxation  method  converges  finitely  to  a  point  of 
P  if 

(a)  P  is  smooth  enough  and  A  £  Jl,  2J 

(b)  A  c    (A*,  2] 

where 

2 

■y*  =  ■; — ; 7^;^   and  u(P)  is  the  obtuseness  index  of  P. 

A    1  +  u(P) 

Furthermore,  if  A  >  2  then  the  sequence  either  converges 
finitely  to  a  point  of  P  or  it  does  not  converge. 

The  first  part  of  the  theorem  shows  that  if  all  the  corners 
of  P  are  smooth  enough  then  the  range  of  A  over  which  we  can  get 
convergence  in  a  finite  number  of  steps  is  [1,  2]. 

The  second  part  of  the  theorem  shows  that  for  a  polyhedron  that 
is  not  smooth  enough,  the  extent  by  which  we  can  deviate  from  A  =  2 
depends  on  the  acuteness  of  the  most  restrictive  extreme  point  of  P 
—  the  more  acute  it  is,  the  less  we  can  relax  A  from  a  value  of  2. 
Merzlyakov 

Definitions  and  Motivation.   The  final  work  on  relaxation 
methods  that  we  review  is  by  Merzlyakov  [30].   Merzlyakov 's 
method  takes  advantage  of  the  fact  that  convergence  properties 
of  relaxation  methods  improve  if,  instead  of  considering  only 
the  most  violated  constraint,  as  was  done  by  Agmon,  among 
others,  a  larger  nun±)er  of  supporting  planes  are  considered. 
We  would  like  to  emphasize  this  unusual  property  of 


AO 

relaxation  methods  —  redundancy,  in  terms  of  number  of  constraints, 
improves  the  convergence  rate.   Additional  halfspaces  have  the  effect 
of  increasing  )i  which  lowers  6,  the  convergence  ratio. 

At  each  x  e  R  let 

V(x)  =  {i:  a'.x  +  b.  <  0,  i  =  1,  .  ..  ,  m} 
That  is,  V(x)  is  the  set  of  indices  of  constraints  violated  by  x. 
A  cavity  C   is  the  set 

C   =  {x  e  R  :  V(x)  =  V}  where  V  is  a  subset  of  first  m  natural 
numbers.   A  subcavity  S  """  of  C   is  the  subset  of  all  x  G  C  which 
violate  constraint  i  e  V  no  less  thananyother  constraint  j  e  V. 
Ties  are  broken  by  the  natural  ordering.   Notice  that  subcavities 
along  with  P  finitely  partition  R  .   Figure  16  illustrates  the  above. 

Associate  with  each  subcavity  S    a  fixed  A.(j,  V)  where 

0  if  i  «!  V 
^,.(J'  V)  =  <  >  0  if  i  e  V 

^>  0  if  i  =  j  . 

2 
For  example  for  subcavity  Sr    ■,  (see  Figure  16). 

Xj_(2,  {2,  3})  =  0 

X^i2,    {2,    3})  >  0 

A^(2,  {2,  3})  >  0 


Method.   Merzlyakov's  relaxation  procedure  is  as  follows: 

Let  X   e    (0,  2)  and  x   specified.   If  at  iteration  n,  x   c  P 
stop.   Otherwise,  let 
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(2,    3} 


'{!',    2,    3} 


FIGURE    16 


A2 


(2.16)     x""*"^  =  x"  -  A[ZA.(j,V)(a!x"+b.)][ZA.(j,V)a.] 


[EA.(j,V)a.]  [ZA.(j,V)a.] 


where  x  e  ^w  • 

We  again  make  the  assumptions  that  ?   ^   0   and  without  loss  of 

generality  assume  ||a.||=l  for  each  i  =  1,  .  .  .  ,  m.   Also  assume 

without  loss  of  generality  that  ZA.(j,V)  =  1.   Goffin  showed  that  if 

P  7^  0  then  the  vector  ZA.(i,V)a.  cannot  be  zero. 

1      1 

Merzlyakov  showed  that  the  procedure  given  above  converges 

to  a  point  of  P  with  a  linear  rate  of  convergence.   If  1  ^  A  <  2 

then  we  can  say 

P"  =  {x:  ZA.(j,V)(a:x  +  b.)  >  0} 
.1       1     1  ~ 
1 

and  clearly  P  c  p".   If  0  <  A  <  1,  then 

P""  =  {x:  ZA.(j,V)Ia'.x  +  (1  -  A)b.  -  A(a!x")]  >  0} 
.11  11 

1 

and  P  c  p  .   Here  again  P   is  a  relaxed  form  of  P  and  the  sequence 

obeys  our  stipulation  for  relaxation  method,  i.e.,  at  iteration  n  we 

n    n  n 

find  x  e  P  where  P  <=  p  . 

The  relaxation  sequence  {x  }  congerges  infinitely  to  a  point 
X  c    3P,  where  3P  is  the  boundary  of  P,  only  if  the  whole  sequence 
(x  }  >   "jams"  into  x  +  N  (x  ).   By  moving  in  a  direction  that  is  a 
convex  combination  of  violated  constraints  we  force  the  sequence  out 
of  tlie  jam.   For  this  reason  Merzlyakov  method  can  be  referred  to 
as  an  antijamming  procedure.   Table  2  and  Figure  17  illustrate  this 
point.   The  table  and  figure  are  based  on  the  example  given  in 
Merzlyakov' s  paper. 
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TABLE  2 
Consider  the  following  system  of  linear  inequalities 
-X  +  lOx  +  10  ^  0 

3x^  -  lOx^  -  30  i  0 

Let  X  =  (0,  0).   We  solve  this  by 

(a)  Agmon's  method    (b)   Merzlyakov's  method 

where  X   =  3/4,  ||  a.  ||  =  1 

n 


(a)    Agmon's   method 


X  =x     -A (ax     +b.    )a.    ; 

1  11 

n  n        n 


n 
x 

1                "     J.     u 

a  .      x     +  b  . 
1                    1 
n                   n 

a . 

1 
n 

n 

n 

n 
^2 

0 

.000 

.000 

-2.873 

.287 
-.958^ 

1 

.619 

-2.064 

-1.121 

,-.099. 
*•    .995'^ 

2 

.536 

-1.228 

-1.543 

.287. 
-.958^ 

3 

.868 

-2.337 

-1.416 

,-.099. 
^    .995^ 

4 

.763 

-1.280 

-1.429 

.287. 
-.958^ 

5 

1.071 

-2.306 

-1.406 

,-.099. 
^    .995'^ 

6 

.966 

-1.257 

-1.392 

,    .287. 
-.958^ 

7 

1.266 

-2.257 

-1.377 

-.099. 
^    .995'^ 

8 

1.163 

-1.230 

-1.362 

.287. 
^-.958^ 

9 

1.456 

-2.208 

TABLE  2  (continued) 


(b)  Merzlyakov  Method 

n+1    n   ,  [ZA.(j,V)(a'.x''  +  b.)][ZA.(j,V)a.] 
X    =  X  -  A    1 1 1     1 1 

[ZA.(j,V)a.]'lLA.(j,V)a.] 
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n 

X 

ZA.(j,V)(a'.x"  +  b.) 

ZA^(j,V)a. 

[ZA.(j,V)a.]' 

n 

n 

n 
x^ 

IZA.(j,V)a.] 

0 

.000 

.000 

-2.873 

-.958^ 

1 

1 

.619 

-2.064 

-1.838 

188 
\037^ 

.037 

2 

7.622 

-.686 

-.459 

188 
\037^ 

.037 

3 

9.371 

-.342 

-.273 

-.099 
^    .995'^ 

1 

4 

9.351 

-.138 

-.125 

188 
\037^ 

.037 

5 

9.827 

-.044 

-.032 

188 
\037^ 

.037 

6 

9.949 

-.020 

-.010 

-.099 
^    .955^ 

1 

7 

9.948 

-.013 

-.008 

188 
\037^ 

.037 

8 

9.978 

-.007 

-.002 

-.958^ 

1 

9 

9.978 

-.008 

Notice  that  the  Merzlyakov  method  is  much  more  effective  than 
the  Agmon  procedure.   Convergence  to  a  point  of  P  appears  much 
faster  with  the  Merzlyakov  Method. 

Generalized  Merzlyakov  Procedure.   The  Merzlyakov  procedure 
can  be  generalized  [24]  by  allowing  the  A.'s  to  be  chosen  based  on 
the  current  point  rather  than  fixed  for  a  subcavity.   For  any  x 
where  V(x)  ^   0  let 
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11 


u 
ex; 

o 

M 
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fO  if  a!x  +  b.  >  0 

A.(x)  4 

^      1^  0  if  a'x  +  b  .  <  0 
V.         11 

where  0  <  ZA.(x)  <  3,  A.(x)  >  0  for  some  i  e  V(x)  and  at  each  n 

i 
there  is  a  j  e  V(x  )  such  that 

A.(x'')(a'.x"  +  b.)  ^ 

_J i L  >  0 

a '  X  +  b  .  , 
J*      J'' 

where  j -'  indexes  a  most  isolated  constraint  and  0  >  0. 

The  convergence  properties  of  this  method  are  described  in  the 
next  chapter. 

Typical  Subgradient  Method 
Procedure 

Let  f  be  a  convex  but  not  necessarily  dif ferentiable  function 
defined  on  R  .  u  £  R  is  said  to  be  a  subgradient  of  f  at  x  if  it 
satisfies  the  following  inequality: 

f(y)  >   f(x)  +  u'(y  -  x)  for  all  y  £  r". 
The  set  of  subgradients  of  f  at  x  is  called  the  subdif f erential  of 
f  at  X  and  denoted  by  9f(x). 

9f(x)  =  {u  e  R  :  u  is  a  subgradient  of  f  at  x} 
X  is  a  minimum  of  f  if  and  only  if  0  e  9f(x  ). 

A  typical  subgradient  algorithm  works  as  f ollov%?s  : 

(1)  choose  X  e  R  arbitrarily 

(/;  compute  a  subgradient  u   of  f  at  x  ,  i.e.,  u  C    9L(x  ). 

If  u  =  0  an  optimal  point  has  been  found.   If  not  go  to  (3). 

fi\    T?-  J   """"l     c   IT       n+l    n    n  n 
(J;  Find  X    as  follows;   x    =  x  -  t  u 
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wliere    t      is   a   scalar   generally   obtained  by    the   approximate 

rainiraization   of 

f(x"  -    tu")    for    t   ^   0. 

Go  to  (2)  with  X  replacing  x 

llie  function  f  could  be  an  arbitrary  convex  function.   If  the 

objective  is  to  find  a  point  of  a  polyhedron,  one  could  choose  a 

number  of  different  functions.   One  choice  could  be 

f(x")  =  Max  (-alx"  -  b.) 
i=l,  . .  .  ,m 

Since  f  is  the  maximum  of  a  finite  number  of  convex  functions,  f  is 

convex.   Oettli  [32J  has  given  another  interesting  way  to  define  a 

function  which  can  be  used  to  find  a  point  of  a  polyhedron.   This 

is  given  latter. 

Variations 

Reverting  back  to  the  general  subgradient  procedure  given  by 

the  iterative  scheme 

n+1    n     n  n 
X    =  X  -  t  u 

various  proposals  [16]  have  been  made  for  the  step  size  t  .   The 

earliest  one  was  by  Polyak  [34]  who  suggested  Zt  =  =°  and  lira  t  =  0. 

n— >oo 

The  merit  of  this  procedure  is  that  we  could  choose  any  arbitrary 
subgradient  u  e  8f(x  )  and  the  sequence  would  converge  to  some  x* 
with  the  stipulated  stepsize  (e.g.,  with  t   =  n^ •   However 
convergence  is  dismally  slow  and  the  method  is  of  .little  practical 
value.   Polyak  attributes  the  original  idea  of  this  method  to 
Shor  [36]. 

Another  idea  also  due  to  Polyak  [35]  is  to  use 
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(2.17)     ^^   A^IfCx")  -  f] 
t  =   -   ■ 


II  "l|2 

—        -k 

with  f  =  f(x  )  the  optimum  value  of  the  function.   Polyak  has  shown 

that  vjith  0<b,  ^A   =b„<2  the  sequence  converges  to  x  ^^;ith  a 
1    n    z 

linear  rate  of  convergence.   A  variation  of  2.17,  also  due  to  Polyak 

is  to  use  f  >  f(x  ),  i.e.,  an  overestimate  of  the  optimum  f(x  ). 

Here  again  if  0  < b  ,  =  X   =  b„  <  2  the  sequence  converges  at  a 

1  n  <i 

linear   rate  but   to    the   level   set 

S   =   {x:    f(x)    ^   f}. 

Thus  for  the  result  to  be  of  value  we  must  choose  close  enough 

overestimates  of  f(x  )  and  this  is  difficult  in  practice. 

A  final  variation  on  the  theme  of  2.17  is  to  use  an 

underestimate  f  <  f (x  ).   Eremin  [12]  has  shown  that  the  sequence 

converges  to  x   if  lim  X      =0  and  ZA   =  0°.   Here  again  convergence 

n  n 

to  the  optimal  solution  is  very  slow. 

Some  Subgradient  Procedures 
Polyak 
(2.18)  Theorem  (Polyak) 

Let  f  be  a  quasi  convex  function  to  be  minimized  over  R  .   If 

the  sequence  {x  }  is  obtained  as  follows 

n+1    n,     n    -ii     iinii 
X    =  X  -  A    u   with  A   =   u    ,  then  there  exists  a 
n  II  — Y^ii       n   "   " 
II  "II 

subsequence  (x  1^}  such  that  lim  f(x  '*-)  =  f 

k— x» 

A 

Proof:   Select  a  >  f   and  define 
S^  =  {x  e  r'^:  f(x)  i  a} 
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S   is  convex  (it  is  a  level  set).   Hence  there  exists  an  x  in  int  S 

a  a 

•k 

since  a   >    f      (see   Figure   18).      Choose  p   >   0   such    that    the    neighbor- 
hood  B    (x)  c  S    . 

■  p      a 

Suppose  x  q'  S   for  all  n 

a 


f(x) 


(a)  X  £  R 


FIGURE  18 


f(x)  -  f(x")  ^  u''  (x  -  x"),  -V-x 
since  u   is  a  subgradient  at  x  .   Thus 

f(x")  -  f(x)  g  u"'  (x"  -  x).   Also 


contours  of  f 
(b)  X  e  R^. 


f(x)  1  f(x")   ^x  e  S 


a 
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Combining  the  above 

u"  (x"  -  x)  i  f(x")  -  f(x)  ^  0,  -^x  e  S 


a 


n 


This  holds  for  a  point  x  +  l. e   B^(.x)c  S  , 


I 


n 


^                        nn^n.^pu. 
Hence  u     x      ^   u      (x  H ) 

llu"|| 


1 1    ri  II      ^      n    ,    n        ~, 
or  p|   u      I     ^   u      (x     -   x) 


„    .  II     n+1        ~   II  II     n  n        ~  m  2 

But  x  -x        =        x     -u     -x 


-II     ^        ~l|2,||     n||2  nn        ^ 

=    II  x     -    x||       +    II  u    II       -    2u      (x     -   x) 

^    II  x"-   ill  2+    ||u"||2-    2p||u"|| 

=    II  x"-  ill  2+    ||u"||2   -2pA^. 

Choose  N  such  that  A   =  p,  -V-n  =  N.   This  is  possible  since  lim  A  =0. 

n  n 

n — x» 

For   n  =    N   to   n  =    N  +  ra 

0^    ||x^'^-l-   x||2^    II  x^-   S||2  +  Ta,(A,  -   2p) 

n=N 
.,  _  N+m 

g    II  x^  -   X    II  2  -  p      Z   A^ 

n=n 

For  m  large  enough  the  right  hand  side  is  negative,  since  ZA   diverges 

—  a  contradiction.   Hence  x  e  S  . 

a 

Let  lim  a   =  f  .   Then  we  have  proved  there  exists  x  ^^  e  S 


such  that 

m 


n      ^^ 

lim   f(x  1^)  =  f '. 


Computational  experience  with  heuristic  subgradient  methods  has 
been  reported  by  Held  and  Karp  [18],  and  Held,  Wolfe,  and  Crowder  [19] 
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Held,  Wolfe,  and  Crowder  use  underestimates  to  f(x  ).   However  the 

stepslze  X     =  A  is  kept  constant  for  2m  iterations  where  m  is  a 

measure  of  the  problem  size.   Then  both  A  and  nimiber  of  iterations 

are  halved  successively  till  the  number  of  iterations  reaches  some 

value  k,  \  is  then  halved  every  k  iteration.    This  sequence  violates 

the  requirement  ZA  =  "»  and  it  is  possible  that  the  sequence  may 

n 

converge  to  a  wrong  point.   However  in  their  computational  experience 
this  did  not  happen.   Held,  Wolfe  and  Crowder  experimentally  found 
that  their  procedure  was  effective  for  approximating  the  maximum  of 
a  set  of  piecewise  linear  concave  functions.   In  their  problems 
(assignment,  multicommodity ,  and  maxflow)  the  number  of  constraints 
is  enormous,  of  the  order  of  100 1  .  In  all  these  cases  they  were  able 
to  get  linear  convergence  (which  is  assured  by  the  choice  of  stepsize) 

Held  and  Karp  used  overestimates  to  f(x  )  to  obtain  bounds  for 
a  branch  and  bound  procedure  for  solving  the  traveling  salesman 
problem.   Using  this  procedure  they  were  able  to  produce  optimal 
solutions  to  all  traveling  salesman  problems  ranging  in  size  upto 
64  cities. 
Oettli 

Oettli  proposed  a  subgradient  method  for  finding  a  point  of  an 
arbitrary  non-empty  polyhedron.   We  will  discuss  his  less  general 
method  which  can  be  used  to  solve  a  linear  program.   As  before,  using 
the  dual  and  an  additional  constraint  expressing  strong  duality,  a 
linear  jirogram  can  bo  stated  to  be  the  problem  of  finding  an  x  E  1\ 
satisfying 

5-"!'(x)  ^0     i  =  1,  .  .  .,  m 
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v^rhere    t.  (x)    represents    the  magnitude  by  which   constaint    i    is   violated 

at   x;  i  .  e .  , 

5,    (x)    =   max    {O,    -a.x  -   b    } 
i  11 

Define  £  (x)  to  be  the  vector  formed  from  violations  of 

different  constraints 

l^(x)   =  a.'^(x),    £„'^(x),  ...,  l'^(x)) 
i       Z  m 

Let  p(*)  be  an  isotone  norm  on  R  ,  i.e.,  if  0  =  x  ^  y  then 
p(x)  =  p(y).   It  may  be  noted  that  all  the  usual  norms  like  Euclidean 
and  Tchebycheff  are  isotone  norms.   Define 

d(x)  =  p(£^(x)) 
Then  clearly  x  satisfies  i.    (x)  ^  0  iff  d(x)  =  0. 

Oettli  showed  that  d(*)  is  a  convex  function  and  then  described 
an  iterative  scheme  using  subgradients  of  d(*)  that  minimizes  d(*) 
over  R  .   Tliis  scheme  is  quite  general  since  we  have  a  different 
function  for  different  norms. 
(2.18)   Lemma  fOettlil 

d(*)  is  a  convex  function. 

Proof: 

Z.    (z)  is  convex  for  all  i. 
J 

Thus   I   (azl  +  3z")    <  aS,"^(z^)   +  BS-"*"(z^)    for  a, 3   >  0,   a+3  =   1. 
Since    p   is  monotonic 

p(^^(azl  +  3z^)  <   p[a£'*"(zb  +  Bl^(z^)] 
and  since  p  as  a  norm  is  convex 

^  p[a£"^(zl)  +  p[3£'^(z^)] 

=  apU^Czl)]  +  3  p[Ji^(z^)] 


hence  d(az^  +  3z^)  ^  ad(z^)  +  3d(z^) 


/// 
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Let  X  be  given  initially.   If  x  £  P  stop.   Othen^;ise,  let 

n+1    n    d(x"j n, 

(2.19)     X    =  x ,  t(x  ) 

.  n  '   ,  n. 
t(x  )  t(x  ) 

where  t (x  )  is  any  subgradient  of  d(')  formed  according  to  Oettli's 

rule  at  x  .  Ti\e   sequence  {x  }  converges  to  a  point  of  P  with  a 


linear  rate  of  convergence, 
(2.20)   Lemma  [Oettlil 


Let  Y  £  P>  then  for  the  Oettli's  procedure 

II     n+1           II  2   <    II     n           ,1  2        d2(x") 
II  x        -   yH      ^    II  X     -  yII      -  -; 1 


t(x")||' 


Proof: 


11     ri+1            II  2         II     n        d  (x   )  ,   n,  n  2 

II  X  -   yII       =I|x     -  -^— ^ —   t(x   )    -   yII 

t(x")'t(n") 

=  II  x"'-  yII^  ^^^^^ .ml) t(x")'(x"  -  Y) 

/  n.  ,  .  n,     .  n,  ,  .  n.  ' 

t(x  )' t(x  )    t(x  )'t(x  ) 

By  definition  of  subgradient 

d(Y)  I   d(x")  +  t(x'')'(Y  -  x") 

Since  d(Y)  =  0  therefore 

,,n.  ^   ,n.i,n     . 
d(x  )  $  t(x  )  (x  -  Y) 

Collecting  the  above  results 

n+1  II  2    ,    II     n  II  2        d^(x") 


II  x"^^  -  yII  ^  -5   II  x"  -  yII      -  "   ^,  ^       „     .  IU\ 

t(x   )    t(x    ) 


(2.21)      Theorem    [Oettli] 


c"""^  -  x"||    g  e   II  x"  -  x"'||,0  <  0  <  1 


Proof: 

(2.22)  II  x"+l   -   yII  '  ^    II  X-  -   yII  '   -   '    ^f ;        „ 

t(x    )    t(x   ) 


^  II  x"-  yII 


By  triangle  inequality 


X   -  X    <    X  -  Y   +  I  Y  -  X 


5A 


^  2  II  X-  yII 
Since  this  holds  for  all  Y  E  P  including  Tp(x  ) 
II  x"  -  x'^ll  ^  2d(x^  P) 


Substituting  x   in  2.22 


1-^  /  n^ 


t(x")  t(x") 


-1       i2  /  n. 

n    "ll  2  ri    d  (x  ) 
X  -  X      11  -  


/  n  N  '  /  n .    II  n     ''  1 1  2  ' 
t(x  )  t(x  )      X   -  X 


,"./l|2„.ii!(Zi 


,2.  n 


-] 


II  "+1     '' 
or    X    -  X 


<    X   -  X 


[1  -  -  — ]2 
11    ,  ^^\2 


t(x')  t(x')    d"(x  ,  P) 
2  1 


.  rd(x)      „        ii,n.  II 
where  A  =  mf  ;  B  =  sup  ||  t(x  )  || 


n   d(x  ,  P)        n 
Oettli  has  shown  that  A  >  0  and  B  is  finite.   From  these  observations, 
the  result  follows  if  we  take 

2  1 


/// 


=u-|^n 


In  the  next  chapter  we  show  that  the  Generalized  Merzlyakov 
procedure  is  equivalent  to  the  Oettli  procedure  in  as  far  as  the 
sequence  of  possible  moves  made  by  the  two  procedures  are  concerned. 
Some  of  the  More  Recent  Algorithms 

A  limitation  of  subgradient  optimization  methods  is  the  absence 
of  a  benchmark  to  compare  the  computational  efficiency  of  different 
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algorithms.   Algorithms  for  unconstrained  optimization  of  smooth 
convex  functions  have  the  reliable  method  of  steepest  descent 
(Cauchy   procedure).   as  a  yardstick  against  which  other  algorithms 
are  compared.   If  they  are  better  than  the  Cauchy  procedure,  they 
usually  deserve  further  consideration.   In  subgradient  optimization, 
the  Cauchy  procedure  may  not  work.   An  exampJe  v;liere  the  Cauchy 
procedure  is  ineffective  is  reproduced  below  from  [37]: 
The  function  considered  is 

max  {3  X  +  2y,  2x  +  lOy} 


FIGUI^  19 


The  contours  of  this  function  are  sketched  in  Figure  19.   One 
can  calculate  that  if  the  Cauchy  algorithm  is  started  at  z,  the 
sequence  converges  to  the  origin  (0,0)  which  is  not  the  minimum 
of  the  function. 
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Some  of  the  recent  subgradient  procedures  can  be  divided  into 
two  categories.   In  the  first  category  are  those  that  claim  to  be 
at  least  as  good  as  some  other  subgradient  procedures.   Tliese 
algorithms  claim  their  superiority  based  on  empirical  (computational) 
results.   In  the  second  category  are  those  that  claim  to  be  reasonably 
effective  for  both  dif f erentiable  and  non-diff erntiable  convex 
functions  and  claim  that  the  algorithm  is  superior  when  applied  to 
quadratic  and  convex  dif f erentiable  functions. 

As  an  example  of  the  former  we  present  the  method  of  Camerini, 
Fratta  and  Maffioli  15],  and  as  an  example  of  the  latter  we  present 
Wolfe's  [38]  method  of  conjugate  subgradients  for  minimizing  non- 
diff  erentiable  functions.   Wolfe's  algorithm  closely  resembles 
Lemarchel's  method  128]. 

Camerini,  Fratta  and  Maffioli.   Here  one  adopts  the  following 

iterative  scheme  for  minimizing  f(').   Let 

X  =  0  and 

n+1    n      n,      n.       ,.^.,tr    ,.    n 
X    =  X  -  t  s  where  s   is  a  modified   gradient 
n 

direction  defined  as 

s    =0  for  n  =  0  and 

n    c\  f   ^\     ,    n      n-1 
s   =f(x)+3s 
n 

with  f ' (x  )  a  subgradient  of  f  at  x  and  3   a  suitable  scalar.   s 

n 

is   equivalent    to   a  weighted   sura  of   preceding  gradient   directions    and 
is    used   as   an    anti-jamming   device.      The    following    lemmas   and 
theorems   develop    the  essential   elements   of   Camerini,    Fratta   and 
Maffioli   method. 
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(2.23)      Lemma 


*  n 

Let   X     and  x     be    such    that 


f(x   )    ^   f(x   ),    then 


0  S   f  (x")  -  f  (x*)  =  f '  (x"")'  (x*  -  x") 
This  lemma  states  that  the  negative  of  the  subgradient  makes  an 
acute  angle  with  the  vector  (x  -  x  )  where  x  corresponds  to  an 
optimal  solution.   See  Figure  20. 


/  X 


f  (x") 


FIGURE  20 


-f  (x") 


(2.2A)   Lemma 

If  for  all  n 


0  <  t   <  1^^-  ^(^  )  -^^  o      > 


n,,  2 


and  3   =0,  then 
n 


'    A 


0  ^  f  (x  )  (x  -  X  )  ^  -s   (x  -  X  ) 
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Lemma   1.1k   extends    the  subgradient   property   of   Lerama   2.23    to  -s 

-s"  also  makes  an  acute  angle  with  (x  -  x  ). 

(2.25)   Theorem 

t 
*■     n-l,,n,        ii,, 

r-Y,  ^ — Lix_  if  3^-1  f  (x")  <  0 

Let  6^=1       II  s"-^  II  2 
\^  othervjise 

with  0=1  =2:   then 
n 


-(x  -  x")  s    >  _  (x   -  X  )  f  (x  ) 


lls^ll 


If'Cx") 


Theorem  2.25  shows  that  a  proper  choice  of  6   ensures  that  -s   is  at 

least  as  good  a  direction  as  -f (x  )  in  the  sense  that  -s   makes 

more  of  an  acute  angle  with  (x  -  x  )  compared  to  -f ' (x  ).   If 

-f'(x")  and  -s"""*"  make  an  acute  angle  we  take  -s^   =  -f'(x  ).   Only 

if  -f'(x")  and  -s"        make  an  obtuse  angle  do  we  take 

-s   =  -(f '  (x  )  +  3  s    ) 

n 


Figure  21  illustrates  the  above  theorem. 


n-1 


-f  (x  )  =  -s 


n-1 


FIGURE  21 
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(2.26)      Ttieorem 

n+1 


II     <    II  X*-    x" 


Proof: 


""  l|s"l|' 

C^lj  s"||  ^  g    (f(x")   -  f(x*))    <    2[f(x")   -    f(x--v)] 

^  -2f'(x    )  (x'<   -    X   )    by   Lemma    2.24 

n  '  n 

S  -2s      Cx*  -   X   )    by    Lemma    2.25 


t^||s"||2   +   2s"' (X*  -    x")    <    0 


or  t^||s"||  2   +  2t   s"'(x*  -   x")    <   0 

n  n 

adding    ||  x*  -   x    ||       to   both   sides   gives 


Hence 


n+1  I  I        .      1 1        ,  n  1 1  r-ry-j 

X"   -    X         II     <    II  x-*   -   X    II  I/// 


Theorem   2.26    shows    that    the   sequence    is   Fejer-monotone  with    respect 
to   an  optimum   point   x*.       Using  arguments    similar    to    those    used 

earlier   in   relaxation  methods,    it   can   be   shown    that    the    sequence 

r  n-i 

ix  ;  converges  to  x'^'. 

Camerini  et  al . ,  use  heauristic  arguments  to  show  that  the 
best  value  for  Y  is  1.5.   They  then  claim  superiority  of  their 
method  over  that  of  Held  and  Karp  [18]  and  Polyak  [35]  based  on 
computational  results. 

Wolfe.  In  Wolfe's  method  [38]  we  generate  a  sequence  of  points 
{x  }  of  directions  {d  },  of  subgradients  {f'(x  )}  and  scalars  {t"} 
such  that  for  n  =  0,  1,  2,    ... 
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t  ^  0,  f  (x"")  e  3f(x") 
n 

n+1    n  ,    .n    , 
X    =  X  +  t  a   and 
n 

^,   n+1,  ^  (T/  n, 
f (x   )  <  f (x  ). 

At  step  n  we  have  a  bundle  of  vectors  G  which  is  formed  by  a  set  of 

rules.   G  consists  of  f ' (x  ),  possibly  -d    and  a  (possibly  empty) 

string  f'(x    ),  f'(x   ) ,  . . .  of  limited  length. 

We  set  d   =  -N  G  where  N  G   is  a  unique  point  in  the  convex 
r  n        r  n 

hull  of  G  having  minimum  norm.   Demjanov  I 10]  has  shown  that 

Vf (x)  =  -N  8f (x)  where  Vf(x)  is  the  gradient  of  f  at  x  when  f  is 

smooth  at  x  and  3f(x)  is  the  subdif ferential  of  f  at  x.   As  is 

typically  the  case,  the  stepsize,  t  ,  is  determined  by  an  approximate 

minimization  of  f(x  +  td  )  for  t  =  0.   At  some  step,  d   =  -N  G 

n     r  n 

will  be   small,    and    in   addition    the    last    p  +  1   points    (for   some 
interger   p)    are    all    close    together.       Tlien   x     will   be   our   approximate 
solution. 

For  example   consider    the   illustration   in  Wolfe's   paper 
f(x,y)    =   max   {f    ,    f    ,    f    }     whei 


;re 


fj^  =  -X,  f^  =  X  +  y,  f^  =  x  -  2y 
sketched  in  Figure  22.   A  subgradient  of  f  at  any  (x,y)  is  chosen  to 
be  Vf .  where  i  is  the  least  index  for  which  f(x,y)  =  f..   If  we  start 
with  the  point  (6,3),  we  move  in  the  direction  -Vf(6,3)  =  (-1,-1)  to 
the  point  (3,0).   Using  our  prescription  for  choosing  the  direction, 
the  only  available  direction  at  (3,0)  is  still  (-1,-1).   We  must 
take  a  small  step  in  that  direction  anyway  even  thougli  it  is  not  a 
descent  direction.   Tlien  the  new  subgradient  will  be  (-1,-2)  forming 
the  bundle  G  =  {(1,1),  (1,-2)},  we  find  the  direction  -N  Q     =  (-1,0). 
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FIGURE  22 


The  next  step  in  this  direction  takes  us  to  the  neighborhood 

of  (0,0).   At  the  next  iteration  we  add  (-1,0)  to  the  bundle.   Then 

N^G^  =  0.   We  "reset"  and  start  all  over  again  at  this  point.   We 

accept  the  last  point  reached  as  a  solution  if  N  G   =  0  and  the 

r  n 

last  "p"  points  are  "close  enough." 

For  differentiable  convex  functions,  the  method  is  an  extension 
of  the  method  of  conjugate  gradients  and  terminates  for  quadratic 
functions. 


CHAPTER  3 
THEORETICAL  RESULTS 

Comparison  of  Generalized  Merzlyakov  Method  and 
Some  Typical  Subgradient  Methods 

The  equivalence  of  relaxation  methods  and  minimization  methods 
has  been  noted  by  several  researchers.   However,  these  two  approaches 
are  only  mathematically  equivalent  and  this  equivalence  has  not  been 
generally  made  explicit.   On  the  other  hand  computationally  relaxa- 
tion methods  are  much  simpler.   It  has  been  shown  [24]  that  a 
generalized  form  of  Merzlyakov' s  method  subsumes  most  of  the  typical 
subgradient  methods.   In  particular  it  was  shown  that  the  generalized 
Merzlyakov  method  can  generate  any  move  that  is  made  by  the  Oettli 
method.   In  this  section  we  draw  heavily  from  the  results  of  the 
above  paper  [24]. 
Generalized  Merzlyakov  Method 

The  Merzlyakov  method  is  generalized  [24]  so  as  to  facilitate 
its  comparison  with  some  of  the  subgradient  methods.   This  extension 
greatly  increases  the  flexibility  of  the  original  method.   The 
following  procedure  allows  the  A.s  to  be  chosen  based  on  the  current 
point  rather  tlian  being  constant  for  a  subcavity.   For  any  x  where 
V(x)  /  0,  let 

/"o  if  a^x  +  b.  >  0 

A.(x)  =  ^       ""      "" 
^       I  >  0  if  a'.x  +  b.  ^  0 
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where  0   <   Y.   A.(x)  =  3,  A .  (x)  >  0  for  some  i  cV(x)  and  aL  each  n 
1  1 

there  is  a  j  e  V(x  )  such  that 
^.(x")(a'.x"  +  b.) 

3.1  J- ^ ^  >  e 

a '.  .  X  +  b  .  . 
J*      J* 

where  j*  indexes  a  most  violated  constraint  and  0  >  0. 

For  A  e  (0,  2)  and  x  initially  specified,  stop  at  iteration  n 

if  X  e  P.   Otherwise,  let 

XL   A.(x")(a:x"  +  b.)(Z  A.(x")a.) 

T  „        n+1    n   i_     1      1     1     X 

3.2       X    =  X  


(E  A.(x")a.)'(Z  A.(x")a.) 
11      11 


We  continue  to  assume   a.    =1  for  all  i. 

1 

In    the   next    two    theorems,   we   show    that   the   generalized 
Merzlyakov  method  converges  with   a   linear  rate   of  convergence    to   P. 
3. 3      Tlieorera 

Let   z   e   P,    and    {x    }    obtained   by    the   generalized   Merzlyakov 
method.      Then 

II  x"-*-!-    z||     <    ||x"-    z|| 
Proof: 

2(x"-z)'    A   I   A.(x")(a:x"  +  b.)E   A.(x")a. 


||x"^^-z||=l|x"-z||^-- 


(Z   A.(x")a.)'(Z   A.(x")a.) 
11  11 


A^(Z  A.  (x'')(a'.x''  +  b.))^ 

+  i ~ 

(Z   A.  (x'')a.)'(Z.  A.(x")a.) 


The   last    two    terms    can   be    conAiined    to    get 
-A   Z   A.(x'^)(a'.x"   +  b.) 


[2Z   A.(x")a:(x"   -    z) 


a   A^(x")a^)'(Z   A^(x")a^) 


■A(Z  A.(x")(a'.x"  +  b.))] 
Ill 
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Now  a!  X     +  b.    =   0  whenever   A.(x   )    >   0   and   a !  x"^  +  b      <   0   for   some 
11  1  11 

A.(x  )  >  0  and  thus  the  first  bracketed  term  is  positive. 

Since  z  G  P  and  alz  ^  b.,  then 
1   ~   1 

-Z  A.(x")a'.z  ^  E  A.(x")b. 
11       11 

Thus    the   second   bracketed    term   is    less    than    or   equal    to 

(2  -A)    Z   A.(x")(a:x''  +  b.) 
Ill 

and    this    term   is   strictly  negative. 
Thus 

II     "+1  II     ^11     n 

||x  -z||     ^||x     -2 

Thus  we  have  established  that  the  sequence  is  Fej er-monotone 

with  respect  to  P.   We  next  show  that  the  procedure  gives  points 

which  converge  linearly  to  P.   Let 

d(x^,    P)    =    inf    II  x"  -    zjl  . 
zeP 

3.4      Ilieorem 

Either    the  generalized  Merzlyakov   procedure    terminates  with   a 
point   of   P   or    it    generates   a   sequence  which    converges    to   a  point 
X*  e  P. 

Furthermore,    the   convergence    rate    is    linear. 
Proof:      Assume   x     ?!    P  and    z"   is    the    unique    closest   point    of    P    to   x". 
Then 

d^x"-"\    P)    ^    ||x"+l-    z"||2. 
After   ex-panding    ||  x"        -    2*^11       we    get 

(i:   A    (x'')a.)'(Z   A.(x")a.) 
11  11 

.      „  A(2-A)[Z  A.(x")(a:x"  +  b.)]^ 

=  d-^Cx",    P)  -- 


(Z   A.(x")a.)'(Z   A.(x")a.) 


6  5 


By  assumption 

IE  A.(x")(a:x"  +  b.)]^  k    [A.(x")(a'.x'^  +  b.)]^ 


^  e^(a!^x"  +  b.^^)^  for  some  i  e  V(x") 


From  Hoffman  [21]  we  have  that  there  is  a  p  >  0 
n  r.s    <      ^,   n 


111  us 


pd(x    ,    P)    ^  -a!..x'   -  b..,. 


d^x"^\    P)    <  d2(x^    p)_A(,2-A)p^9V(x",    P) 


(L   A.(x")a.)'(E   A.(x")a.) 
11  11 


Finally   since   Z   A.(x   )    ^   g  and    ||a.||=   1   for   all    i,   we   have 


a  A.(x")a.)'(Z   A.(x")a.)    <   B^ 


And  we    get 

d2(x"^\    P)    <    U  -   A(2-A)e2y2/3^]d2(x",    P)    =  y^d^Cx",    P) 

Tlien   0   ^  Y  <   1   and   d(x",    P)   i  Y"d(x°,    P) . 

Case    1:      If    tlie  sequence   does   not   terminate    lim  d(x",    P)    =   0   and 

n-x» 

Motzkin   and   Shoenberg    [31]   have   shown    that    the   sequence   converges    to 

a   point   X*   in    the  boundary   of   P    for   0   <    A   <   2. 

Case    2:      If    the   sequence    terminates 

II  x""*"     -    z||     <    II  x"   -    zjl     for  all    z   e    P    (Theorem   3.3). 
Also    II  X*  -    z    II     <    II  X     -    2    II     where    z"    is    the    closest   point    to   P 
from  X    . 
Thus 

||x-.'c   -    x"||     ^    II  X*   -    z"||  +  ||z"   -  x"|| 

<    2||x"   -    z'^ll     =    2d(x",    P) 

^  2Y"d(x°,  P) 
where  x*  is  the  last  point  in  the  terminating  sequence. 


/// 


66 

We  can  further  generalize  the  Kerzlyakov  method  and  Tlieorems  3.3  and 
3.4  by  replacing  A  with  A  where  0  <  b.  <  A   <  b   <  2  for  all  n.   In 
the  remainder  of  the  discussion  we  usually  mean  A   =  A  for  all  n  but 
do  not  have  to  restrict  ourselves  in  this  manner. 
Further  Developments  of  Oettli  Method 

Oettli's  minimization  method  was  discussed  in  the  last  chapter. 
We  wish  to  show  the  equivalence  of  Generalized  Merzlyakov  method  and 
Oettli's  method  in  the  next  section  but  before  that  we  need  to  develop 
some  properties  of  Oettli's  method  to  facilitate  this  comparison. 
Again  most  of  the  theorems  and  proofs  in  this  section  are  taken  from 
the  paper  [24]  referred  to  earlier. 

We  represent  the  subdifferential  of  a  convex  function  f  at  x  by  ' 
3f(x)and  a  subgradient  of  f  at  x  by  f ' (x)  i.e.,  f ' (x)  e  3f(x). 

To  recapitulate  the  Oettli  procedure,  we  form  the  sequence  {x"} 
as  follows: 

3.6       ,,"+!  -  ,,n   d(x")t(x") 

t(x")'t(x^) 

where    t(x   )    is    any   subgradient  of   d(«)    at   x". 

3.  7      Lenmia 

For  any  norm  p(«)  on  R™ 

ap(x)  =  9p(0)  n  {p'(x):  p(x)  =  p'(x)'x} 
Proof: 

We  will  show  that  if  p' (x)  c3p  (x)  then  the  following  ralation  holds 
9p(x)  =  {p'(x):  p(y)  2  p'(x)'y  for  all  y}  n 
^p'(x):  p(x)  =  p'(x)'x} 
Conversely  if  the  above  relation  holds,  we  will  show  p' (x)  e    3p(x). 
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By  the  definition  of  a  subgradient  p'(x)  e  3p(x)  if 

p(y)  2  p(x)  +  p'(x)'(y  -  x)  for  all  y. 
Also 

p(Ax)  -  p(x)  >  p'(x)'(Ax  -  x)  for  all  scalars  A. 
Since  p(*)  is  a  norm 

p(Ax)  =  |A|p(x) 
For  A  >  1     p(x)  =  p' (x)'x 

0  <  A  <  1     p(x)  i   p' (x)'x 
Thus  p(x)  =  p' (x)'x. 
From  the  subgradient  inequality 

p(y)  ^  p(x)  +  p'(x)'(y  -  x)  by  substituting  for  p(x)  we  get 

p(y)  >   p'(x)'y  for  all  y.   Hence 

3p(x)  =  {p'(x):  p(y)  >  p'(x)'y  for  all  y}  n 
{p'(x):  p(x)  =  p'(x)'x}. 
Conversely,  if  p(y)  >  p'(x)'y,  for  all  y  and  p(x)  =  p'(x)'x,  we  get 
by  subtraction 

p(y)  -  P(x)  ^  p'(x)'y  -  p(x)  =  p'(x)(y  -  x) 
and  p' (x)  e  dp (x) . 
Finally 

8p(0)  =  {p'(0):  p(y)  ^  p'(0)'y,  for  all  y}  n  r'" 


{p'(0):  p(y)  >   p'(0)'y,  for  all  y}.  Ul 


To  find  the  subgradients  of  d(«)  in  terms  of  subgradients  of  the 
composite  function  p(?,  (x))  we  use  the  following  chain  rule  given  by 
Oettli. 

3.8  Lemma 

Let  p(*)  be  an  isotone  norm  and  p'C^.  (z))  2  0,  then 

3.9  d'(z)  =£"^'(2)  p    (l^iz)) 
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Proof: 


l^{z)   -   £^(z°)  =  Cz  -  z°)'il^'cz°) 


multiplying  both  sides  by  p ' (£  (x))  where  p ' (A  (x) )  =  0 

(^^(z)  -  £^(z°))p'(ll^(z°))  I    {7.  -    z°).'Ji"*''(z°)p'(£^(z°)) 

Using   the   definition   of   subgradient   for    p'(i2-    (z    )) 

p(Jl'^(z))   -   p(£'*'(z°))    ^    (!i^(z)    -   £"*'(z°))'p'(ii^(z°) 

^    (z  -    z°)'£^'(z°)p'(il+(z°)) 
d(z)    -   d(z°)    =    (z  -   z°)'£^'  (z'^)p'(£"^(z°)). 

But   d(z)   -    d(z°)    ^    (z  -    z°)'d'(z'^) 

hence     d'(z°)    =   £^   (z°)p  '  (£"''(z°) )  . 

The   subgradient   of   5-.(x)    may  be   found    using    tlie    following  well 

known    result 

3.10  35,. "'"(x)   =    {-Aa.:    A  =  0   if  a'.x  +  b.    >  0 

1  1  11 

A   =   lifa'.  x  +  b.<0   and 
1  1 

A  c    [0,1]    if   a'.x  +  b.    =0} 

The  subdiff erential   of    £    (x)    can   in    turn  be    found    from 

3.11  3i2,"^(x)   =    {(h,,    ...,   h   ):    h.   e    3K,"*"(x)} 

1  mil 

Generalized  Merzlyakov  method  and  Oettli  method 

With  the  results  developed  so  far  in  the  previous  sections  we 
are  now  in  a  position  to  develop  an  expression  for  equation  3.6  in 
terras    of    the   parameters    of    the   generalized  Merzlyakov  method. 

3.12  Theorem 

Let    X  r.    K    ,    x  ?!    P   and    t(x)    be    a   subgradient   of    d(*)    at   x   formed 

by  using  the  chain  rule  of  equation  3.9.   Then 

,,  .  ,  .     [ZA. (alx  +  b.)]ZA.a. 
d(x)  t(x)   ^    1   1 1     11 

t(x)'t(x)    (ZA.a.)' (EA.a.) 
11     11 
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for  some  A.'s  where  A.  =0,  A.  =0ifalx+b.>0,  and  ZA   >  0, 
1  11         1     1  i 

Proof: 

By  equation  3.9  we  have 

t(x)  =  l^   (x)p'(£"*'(x))  for  some  l^   (x)  c  3£"'"(x) 
and  p'(Jl"^(x))  c  3p(il"'"(x)). 

By   equations    3.10   and    3.11 

+' 
^      (x)    =  -(n,a    ,    r]r.a    ,    ..,,    n  a    )   with 
i   i        I    A  mm 

0    if   a!x  +  b.    >   0 
1  1 


n.    =  <  1   if  a'.x  +  b.    <  0 


G   [0,1]    if   a! x  +  b.    =   0 


Also   from   the   proof  of   Lemma   3.7 


d(x)   =  p'  {I   (x))'il   (x)   where 
/n^Ca^x  +  b.) \ 


l^U)  = 


\ 


n  (a'x  +  b    ) 
mm  m 


/ 


Let 


,+ 


p' (£    (x)).      In  Oettli's  method  we   need  h   ^   0.      Substituting 


^u  1  c  •  •  d(x)t(x) 

these   last   rew   expressions   into  — -, — rn — i—:-  gives 
^  t(x)'t(x)    ^ 


d(x)t(x) 
t(x)'t(x) 


[ZA.Ca'.x  +  b.)]ZA.a. 
1      1 1  11 

(ZA.a.)'(EA.a.) 
11  11 


,  +  , 


with   A.    =    n.li.    ^   0.      We   also   have  -ZA.(a!x  +  b.)    =   p(£    (x))    >   0   since 
111  111^ 


X  t    ?   and   ZA.    >   0. 

1 


m 


We  have  thus  established  a  close  similarity  between  the  two 
methods.   We  have  however  yet  to  establish  the  two  other  requirements 
viz  ZA .  ^  6  and  that  condition  3.1  is  satisfied.   First  we  note  that 
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since  the  subdif ferential  of  p(*)  is  compact  ZA.  is  bounded  above. 

Hence  the  condition  EA.  ;^  6  is  satisfied.   Now  to  show  that  condition 

1 

3.1  is  satisfied,  we  see  on  the  lines  of  the  proof  of  Theorem  3.12 

that 

d(x)    =  p(l^(x))    =  p'(£'^(x))'£"^(x)  =   h'(£^(x)) 

=  -i:ri.h.(a'.x  +  b.)  =  -ZA,  (x)(a'.x  +  b.) 
Ill     1       1     1     1 

where  A. (x)  =  n .h .  ^  0. 
1       11 

Hence 

-ZA.  (x)(a:x  +  b.)  =  p(JL'^(x)) 
111 

and    m  max  A.(x)(-a'.  x  -  b.)  >  pCS,  (x)) 
i 

Also      i    (x)  ^  £.,(x)e,.     where  i*  is  the  most  violated 
constraint  and  e.^  is  a  unit  vector  with  a  1  in  position  j*.   Since 


p(*)  is  monotonic 


P(^'^(x))  >  P(£^.(x)e..) 
2^  J* 


=  -(a:  X  +  b.  .)p(e... 


) 


Thus 

m  max  A .  (x)  (-a.'x  -b.)^-(a'x  +  b  ,)p(e  ,) 
_^   1      1     1       2*  J'-'    j'' 

1   -  (a !  ,  X  +  b  .  .  )  min  p  (e  . ) 


Hence 

A  rx)(a:x+b.)  ,  "^^"  P(^) 

where  i  e  V(x)  maximizes  the  left  hand  side. 

If  we  allow  A  =  1  in  the  generalized  Merzlyakov  method,  we  find 
from  the  above  analysis  that  any  move  made  by  the  Oettli  procedure 
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can  also  be  made  by  Che  generalized  Merzlyakov  method.   However,  the 
reverse  is  not  true.   This  can  be  seen  if  we  let  A  7^  1  and  take  the 
case  where  x  violates  only  one  constraint.   llius  the  Oettli  procedure 
is  strictly  subsumed  by  the  generalized  Merzlyakov  method.   Compu- 
tationally  the  relaxation  method  is  very  simple  to  implement  since 
the  directions  of  movement  can  be  easily  computed.   Also  it  is  a 
very  flexible  procedure  since  all  that  is  required  to  change  the 
direction  of  movement  and  the  step  length  is  to  change  the  weights 
A.(x).   On  the  other  hand  the  Oettli  method  requires  a  suberadient 

1  TO 

to  be  computed  at  each  point  which  is  not  a  trivial  calculation. 

In  an  earlier  paragraph  we  indicated  that  the 

Merzlyakov  procedure  could  be  further  generalized  by  allowing  X    to 

vary  with  each  iteration.   However  this  would  require  tiie  additional 

condition  0<b.^A   =b„<2.   Oettli  has  recently  generalized  his 
X    n    2 

mjethod  [33]  to  incorporate  the  above  feature.   His  generalized 

method  requires  that  A  c  (0,  2)  for  all  n  with  the  additional 

n 

stipulation 

ZA  (2  -  A  )  =  +  ~. 
n      n 

A  second  aspect  in  which  the  two  methods  differ  is  that  in  the 

generalized  Merzlyakov  method  we  could  use  a  different  set  of  A.'s  at 

each  iteration.   This  could  be  introduced  into  the  Oettli  method  by 

considering  different  norms  at  eacl;  iteration. 

Comparison  With  Some  Other  Subgradient  Methods 

In  this  section  we  show  the  versatility  of  the  generalized 

Merzlyakov  method  by  showing  that  some  of  the  other  typical  subgradient 

procedures  are  strictly  subsumed  by  the  generalized  Merzlyakov  method. 
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We  consider  the  methods  of  Polyak  ([34],  [35]),  Eremin  [12]  and  Held, 

Wolfe  and  Crowder  [19].   These  subgradient  methods  are  capable  of 

tackling  more  general  functions,  however  we  compare  them  when  the 

objective  is  merely  to  find  a  point  of  a  polyhedron  P.   There  are  a 

number  of  ways  of  defining  the  function  f(*)  such  that  when  we 

minimize  it  we  get  x*  e  P.   A  typical  choice  is 

3.13      f(x")  =  max  -(a]x^   +   b.). 

1      1 

i=l, . . . ,m 
We  will  show  that  with  f  thus  defined  the  subgradient  methods  are 
subsumed  by  the  generalized  Merzlyakov  method.   The  above  subgradient 
methods  can  be  collectively  described  by  the  following  sequence: 

n+1    n    ,  [f(x  )  -  f]    ,  ,  n, 
^    =  ^  -  \      ||f'(x")P  ^  ^^  ) 

where    f   is   an   estimate   of   fCx*)    the    optimum  value    of   f.      If    f(x    ) 

assumes    its   value    for   a   unique    i    in    the   relation    (3.13)    then    the 

subgradient   at   x    ,    f ' (x   )    =  a . .      If    there   are    ties,    f'(x    )    can  be 

taken    to  be  a   convex   combination  of    the  maximizing   a.s.      Let    0(x   ) 

represent    the   set   of   indices    i   C    {l,     ...,    m}    for  which 

fCx"^)    =   max    (-a'.x^  -   b.).      Then 
i=l, . . . ,m 

f '  (x")    =   ZA.(x")a.      with   ZA.Cx"^)    =   1 
11  1 

A_.(x'')    I  0   and   A.Cx"")    =   0   for   i  5^   0(x''). 

Hence  _ 

,,                     1     (f(x")    -    f) 
n+1           n        An  ri  ,    "s 

X  =    X '-   C  (x    ) 

iif(x")r 


=   x     -  A 


(ZA.(x")(a^x"  +  b_^)   -   f)(i;A.(x")a.) 


"    (ZA.(x")a.)'(ZA.(x")a.) 
1111 
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with  X.(x")  >   0,  ZA.(x")  =  1  and  A.(x")  =  0  for  i  ?!  0(x"). 
This  can  be  considered  equivalent  to  finding  a  point  of  P  where 

P=  {x:  a.x+b.  -?^0,  i=  1,  ....  m}. 

11 

If  f  ^  0,  the  procedure  gives  a  point  of  P.   Thus  if  P  ?^  0  the 
subgradient  procedures  mentioned  in  this  section  are  special  cases 
of  the  generalized  Merzlyakov  method. 

An  Algorithm  Using  Relaxation  Method 
Motivation 

We  have  seen  that  with  relaxation  parameter  of  A  =  1  we  can  get 
the  best  convergence  rate  for  a  fixed  y  (see  Figure  lA).   Also  a 
higher  y  leads  to  better  convergence.   We  wish  to  combine  both  these 
features  in  our  algorithm.   We  also  derive  motivation  from  Polyak's 
[34]  and  Oettli's  procedure  [32].   In  this  section  we  give  a  procedure 
for  finding  a  point  of  an  arbitrary  polyhedron  P  and  later  apply  it 
to  the  specific  problem  of  solving  large  linear  programs. 

Given  a  polyhedron  P 

P  =  {x  e  R^:  a'.x  +  b.  ^  0,  i  c  l}. 
1     1 

We  assume  P  ^  0  and  I  f^  0  and  finite.   Goffin  [15]  has  shown  tliat  when 
P  is  full  dimensioned  finite  convergence  is  assured  for  a  range  of 
relaxation  parameters  A  depending  on  the  obtuseness  index  of  the 
polyhedron.   WHien  P  is  not  full  dimensioned,  we  can  partition  it  to 
subsets  M  and  C,  such  that  M  is  full  dimensioned,  and  then  devise  an 
iterative  procedure  relative  to  these  subsets  so  as  to  take  advantage 
of  the  full  dimensionality.   Even  though  finite  convergence  is  not 
assured,  we  can  hope  to  get  a  better  convergence  rate  with  a  proper 
choice  of  the  relaxation  parameters.   There  is  considerable  flexibility 
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in  the  manner  M  and  C  are  chosen.   One  important  consideration  is  that 
it  should  be  easy  to  project  on  to  C.   The  algorithm  is  in  two  phases. 

In  Phase  I  we  use  an  iterative  scheme  on  the  sets 
M  and  C  which  drives  us  close  to  P.   We  do  this  by  starting  with  an 
arbitrary  point  in  C.   We  then  move  in  the  direction  of  M.   This 
movement  could  be  a  projection  onto  M,  or  a  reflection  or  under  or 
over  relaxation,  depending  on  the  value  of  the  relaxation  parameter 
X.   From  the  new  point  thus  obtained  we  project  on  to  C.   We  show 
tliat  the  sequence  thus  constructed  converges  to  P  with  a  linear  rate 
of  convergence.   In  our  algorithm  however,  this  iterative  procedure 
is  continued  merely  to  get  close  to  P.   In  this  special  case  when 
C  =  R  ,  Phase  I  becomes  tlie  relaxation  procedure  of  Agmon. 

Once  we  get  close  to  P  we  switch  to  Pliase  II.   In  Phase  II  we 
consider  the  constraints  of  the  set  P  and  apply  the 
Generalized  Merzlyakov  method.   The  motivation  for  use  of  the 
Generalized  Merzlyakov  method  at  this  stage  is  the  fact  that  the 
set  of  violated  constraints  as  we  approach  P  are  precisely  the  set 
of  violated  constraints  at  the  limit  point  and  under  these 
circumstances,  the  Generalized  Merzlyakov  method  can  be  very 
effective. 

We  first  describe  the  procedure  for  finding  a  point  of  an 
arbitrary  polyhedron  P. 
The  Algorithm 

Phase  I. 

Let  z   £  C-P 

If  z  C  P  stop,  otherwise  define 
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n+1 


=  T^  {z""  +  Ad,j(z")a.  }  =  T^(s") 


where  0  <  A  <  2  and  a.   corresponds  to  the  most  violated  half space 

n 


n 


of  M  at  z   and 
3.14 


n  _   n    ,   ,  n, 
s   :^  z   +  Ad,,(z  )a . 

M      1 


We  will   show    that    the   sequence    { z   }    is   Fejer-monotone  with 
respect    to   P. 
3.15      Lemma    [GoffinJ 

Let   X,    y,    t  be   points   of  R     and   let 

x'    =   X  +  A(t  -  x)   where   A  e   R.      Tlien 


Proof: 


X'    -   y 


X'    -    y| 


-  y||^  -  AC2  -  A)   lit  -   x||  ^  +  2A(t-y)'(t-x) 


X  +  A(t  -   x)   -   yl 


=    ||x-y||       +  I2(x  -   y)   +  A(t   -   x)]'    A(t   -   x) 

=    II  X  -   y||  2   -  A(2   -    A)||  t   -    x||  ^   +  A[2(x  -   y)    +  2  (t-x)  ]  '  ( t-x) 

=    ||x  -   y||^  -  A(2   -   A)||  t  -   xll  2  +  2A(t  -   y)'(t  -   x) .         |/7Z1 

3.16   Lemma 

|ls"-u||2   =  1|,"  -    ul|2-   Ad^(z")[(2-A)dj^j(.")   -   2a!    (t"   -    u)  ] 

n 
for   u  c   M. 

Proof: 

Substitute    s      for   x   ,    z     for   x  and    t      for    t,    where 

t"  =    z"   +  dj^j(z")a.    .      Then 

n 
2         II     n  II  2        ,  ,„         ,  .  II     n  n,|  2 


5"   -    u|r    =    II  z"  -    u|r   -    A(2   -   A) 


t     -  z 


+  2A(t     -   u)' (t      -    z   ) 
z"  -    u||  ^  -   A(2   -   A)d^j(z")   +  2Ad^j(z")a^    ( t"  -   u) 


n 


.n. 


n. 


n 


z     -   u||       -   Adj^(^')I(2  -   A)dj^,(z")    -    2a^    (t"  -   u)  ] .         U± 

n 
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3.17   Tlieorem 

Let  M  be  a  non-empty  polyhedron  of  R  .   Then  the  sequence  of 
points  generated  by  equation  (3.14)  has  the  following  property: 

II  s"  -  u||  ^  II  z^  -  u||  for  all  u  e  M. 
Proof: 

If  u  e  M,  u  e  H_.  .   Hence  a^.  ( t"  -  u)  =0  for  all  u  e   M.   By 
lemma  3.16 


1  1 

n  n 


s"  -  ull  ^  II  z"  -  ull  .  E77] 


3.18      Theorem 

Let   P   be  a   non-empty   polyhedron   of   R    .  Then    the   sequence    of 

points    {z    }    generated  by   equation    (3.14)    is  Fejer-monotone  with 
respect    to   P,    in   fact 

II  z""^^   -    u||     <    II  z"  -    ujl     for   all  uC    P. 
Proof: 

Let    u  e    P   and   let   b   be    the    unique   point   obtained  by   projecting 

u  onto    the    line    formed   by   extending   s    ,    z  (Figure    23) .       The   angle 

formed   by   s    ,    b,    u   is   a    right   angle    and   for  the    right    triangle 


n     , 
s    ,    b ,    u. 


s"  -    a||2  +    Ha  -    ull  2   =    lis"  -    ull  2 


or 


s"   -    z"+l||  2   +    II  z"+l   -   all  2   +  2||  s"  -    z"^^|I   ||  z"+^   -   a] 

+    ||a-    u||2    =    lis"-    u||2 


n  n+l,|  2         II     n+1  ii  2    ,    „  n     n  n+1  n    ,,     n+1 

s      -    z         II       +    II  z  -    u||       +  2||  s      -    /         II    II  z  -    a 

II     '^  l|2 

=        s      -    u 


or 


n+1 
z 


ull'    <     lis"-    u||2. 
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are 


The  strict  inequality  arises  from  the  fact  that  s"  and  z""^ 
distinct  or  else  z    c  P.   But  by  Theorem  (3,17) 
II  s   -  u||  <  II  z   -  u||   hence 

II     ^'^^  II     ^    II     "  II  T— 

II  z  -    ujl     <    II  z      -    u||  JljJ 

In  the  next  theorem  we  show  that  the  sequence  {z"}  converges 
linearly  to  P. 


FIGURE  23 
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3.19   Theorem 

The  sequence  of  points  {z  }  generated  by  the  above  procedure 
converges  finitely  or  infintely  to  a  point  of  P.  Furthermore  the 
convergence  rate  is  linear. 

Proof: 

I  I     n+1        ^    ,    n+1.  1 1  2         II     n+1  1 1  2    .  .  r^    /   , 

\\  z.  -T(z         )||       =||z  ~'J||       f  °r   a   unique   u  e    P    (the 

closest   point   of  z  to   P) .      Thus 


lU'^'-    u||2=    ||T    (z^  +  Ad    rz")a,    }-   T    ru)|| 


M^   '  i      C 


g    II  zT"  -  ujl  ^  -  Ad^j(z")[2a|  (z''-u)-Adj^(z")] 

n 
since  the  projection  map  satisfies  the  Lipschitz  condition  with 

c  =  1.   Continuing 

II     n+1  II  2    ^    II     n  II  2 


u||^   <    II  z"-   u||^-   Ad    rz")[2d,Xz")   -    Ad    rz")] 


since       al  (z  -  u)  =  d  „(z  )   hence 

1  h 

n 

II  ^"^^  -  u||  ^  <  d^Cz"",  P)[l  -  A(2  -  A)y"^]  by  Lemma  2.12 


We  get 


)'d2(z",  P). 


dCz""*""*",  P)  <  ed(z",  P)  where 


e  =  U  -  A(2  -  X)\i-''^]^^^ 
and  0  £  [0,  1). 
Hence 

d(z  ,  P)  <  9  d(z  ,  P)  and  if  the  sequence  is  infinite 

lim  d(z",P)  =  0 
n— Kjo 

Using  Theorem  [2.4]  and  by  the  continuity  of  the  distance  function 
this  implies  {z  }  converges  to  Z'^   in  the  boundary  of  P. 
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If   the   sequence   terminates    at   z''',    then 

II  z>''  -   Tp(z")||     <    II  z"  -  Tp(z")||     =   d(z",    P) 

by  Fejer-monotonicity  of  the  sequence.   Hence  z*  and  z  both  belong 

to  the  ball  B       (T^Cz")),  and 
d(z",  P) 

nn  ^oj/H   „.  ^  „„n ,  .  0 


z'V  -  z"||  <  2d(z",  P)  <  20"d(z^,  P).  Ul 


Phase  II.   When  the  last  k  points  are  within  £  of  each  other, 
we  switch  to  the  Generalized  Merzlyakov  procedure.   This  is  because 
if  we  are  near  the  limit  point,  tlie  only  active  constraints  would 
be  the  tight  constraints  at  the  limit  point  of  the  sequence.   In 
such  a  situation  the  Generalized  Merzlyakov  method  can  be  very 
effective  with  a  proper  choice  of  A  and  A.(x).   We  recapitulate 
that  in  Merzlyakov's  procedure  (which  is  a  special  case  of 
Generalized  Merzlyakov  method)  the  sequence  is  generated  as  follows: 
^^^    ^   XIEA^.(j,  V)(a;.x"  +  bJ][ZA^.(j,  V)aJ 


X     =  X  - 


[ZA.(j,  V)a.]  [ZA.(j,  V)a.] 


where  x  £  S^ 


If  we  assume  that  we  know  the  set  of  constraints  satisfied 
with  equality  by  T  (x  ),  we  can  generate  the  halfspace 

H*  =  {x  e  R'':(Tp(x")  -  x")'(x  -  TpCx""))  >  0} 
which  gives  convergence  to  T  (x  )  in  exactly  one  step  with  A  =  1 
Suppose  we  number  the  set  of  active  constraints  at  T  (x  )  by 
I*  =  {l,  ...,  p}.   Let  A*  E  (a.,...,  a  )'  be  the  p  x  k  matrix 
formed  by  interior  normals  of  the  halfspaces  11.  for  ic  I*.   Let 
b*  =  (b^,  ....  b^)'  .   Define 

(A^,  ...,A  P)'  =  -(A*A*')"^(A*x"  +b*). 
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With  this  set  of  A .  ( j ,  V) ' s  using  Merzlyakov  method,  Goffin  [15] 

has  shown  that  convergence  to  T  (x  )  is  obtained  in  one  step. 

P 

However,  this  requires  a  comparably  larger  amount  of  computation 
and  finding  (A*A*')    is  not  really  practical  on  large  problems. 
In  this  study  we  have  therefore  concentrated  on  the  Generalized 
Merzlyakov  Method  and  attempted  to  find  what  would  constitute  a 
favorable  set  for  X  and  A . (x) . 
Application  to  Solving  Large  Linear  Programs 

We  now  consider  how  the  procedure  could  be  used  to  solve 
large  linear  programs. 
Consider  the  LP 
max  c ' X 
s.  t.  Ax  S  b 
X  ^  0 
and  its  dual 

min  TT'b 

s  .  t.   A'tt  =  c 

TT  ^  0. 

By  stong   duality,    an   optimal  IT   and  x  satisfy 

b'TT-c'x<0.      Let 

C  =  {(,'^):    b'^  -    c'x  ^  0} 

M  =  {(^):   Ax  ^  b,    -A'tt  ^  -c,    x  ^   0,    tt   >   O} 
and  P  =  M  n    C. 

Tliis  is  only  one  of  the  ways  of  partitioning  the  constraint  of 
P.   There  is  in  fact  great  flexibility  in  the  choice  of  C  and  M. 
The  advantage  of  splitting  C  and  M  is  indicated  above  is 
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(a)  M  has  a  special  structure  and  can  be  decomposed  along  TT 

and  X.   Tills  leads  to  saving  in  storage  as  well  as  easier  arithmetic. 

(b)  C  contains  only  the  coupling  constraints  and  is  easy  to 
project  to. 

However,  there  are  other  ways  of  obtaining  p.  Another 
choice  could  be  to  let  C  have  only  the  non-negativity  constraints. 
Again  the  advantage  of  such  a  construction  would  be  ease  of  pro- 
jection to  C.   Still  another  choice  is  for  C  to  take  the  form 
C  =  {(^):  b'TT  -  c'x  ^  0,  X  ^  0,  7T  ^  0}. 

There  is  user  discrimination  in  this  algorithm  in  the  choice 
of  M  and  C,  A  and  Z  in  Phase  I,  and  A  and  weights  A.(x)  in  Phase 
II.   We  have  already  commented  on  the  possible  strategies  for 
selecting  M  and  C.   We  give  below  strategies  for  selecting  the 
other  parameters.   We  will  first  discuss  the  parameters  for  Phase  I. 

(1)  A^:  If  0  <  A  ^  1  convergence  may  be  slow  if  some  of  the 
"angles"  of  M  are  very  acute  (small  obtuseness  index  of  P) .   In 
this  case  increasing  A  to  near  2  will  have  tlie  effect  of  opening 
the  angles  and  accelerating  the  convergence.   It  therefore  appears 
to  be  a  good  strategy  to  have  A  approximately  equal  to  2  in  Phase  I. 

(2)  _Z  :  In  line  with  the  general  observation  for  relaxation 
methods  that  any  a  priori  information  on  the  location  of  the  solution 
could  be  used  to  advantage,  if  we  have  to  solve  a  linear  program 
witli  only  sliglitly  varying  data,  wc  could  take  £idvantage  of  tliis 
feature.  However,  in  general  wlien  we  do  not  have  any  sucli  prior 
information,  the  choice  of  Z  could  be  dictated  by  the  structure  of 
C,  so  that  we  can  start  with  a  feasible  point  of  C. 
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Now  for  the  parameters  of  Phase  II 

(1)  _X:  As  a  general  guideline  1  <  A  <2   may  be  chosen  as  the 
parameter  for  Phase  II  as  was  done  in  Phase  I.   However,  Table  3 
and  Figure  2A  show  the  sequence  obtained  with  A  =  1.5  and  A  =  . 75 
and  illustrate  that  in  some  cases  choice  of  0  <  A  <  1  may  give 
better  results. 

(2)  A.(j,  V):  In  Merzlyakov  method  the  choice  of  weights 

A.(j,  V)  is  crucial  and  in  a  large  measure  dictates  the  convergence 
of  the  sequence.   It  may  be  noted  that  A  .  ( j ,  V)  ' s  determine  both 
the  direction  as  well  as  the  step  size.   Suppose  a.,  i  c  K 
represent  the  set  of  violated  constraints  in  subcavity  S   where 
X  Z    S'}.      We  can  solve  the  quadratic  problem  of  finding  the  point 
of  minimum  norm  in  convex  hull  of  the  finite  point  set  a.,  i £  K 
to  get  the  weights  A . ( j ,  V).   Such  a  direction  locally  has  a  great 
deal  of  merit.   However  the  computational  effort  is  too  great  to 
merit  its  consideration,  since  at  each  iteration  we  have  to  solve 
a  quadratic  program.   Also  it  may  not  always  lead  to  the  best  set 
of  weights  as  shown  by  the  counterexample  given  in  the  next 
paragraph.   Instead  we  specify  A  .  ( j  ,  V)  =|--|,  where  |k|  represents 
the  number  of  constraints  violated  at  x  ,   When  |k|  =  2  the 
direction  obtained  coincides  with  the  vector  of  minimum  norm. 

Tlie  strategy  suggested  by  us  is  not  the  best  in  all  cases  as 
shovvJii  by  the  following  counterexample: 
Consider   x^  1 

> 

X2  =  2 

1    .0. 
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Case  i:  A 


75 


TABLE  3 
A.(j,  V) 


1  for  all  i  £  V. 


n 

X 

ZA.(j,    V)    • 

(a'.x"  +  b.) 
1               1 

^A.(j,    V)n. 

[ZA.Cj,    V)a.]'   • 

n 

n 

n 

^2 

[ZA.(j,    V)a.] 

0 

0.000 

0.000 

-2.873 

-.958^ 

1 

1 

.619 

-2.064 

-1.838 

188 
\037^ 

.037 

2 

7.622 

-    .686 

-    .459 

188 
\037'' 

.037 

3 

9.371 

-    .342 

-    .273 

-.099 
^    .995^ 

1 

A 

9.351 

-    .138 

-    .125 

188 
\037'^ 

.037 

5 

9.827 

-    .044 

-    .032 

188 
\037'' 

.037 

6 

9.949 

-    .020 

-    .010 

-.099 
^    .995^ 

. 

1 

7 

9.948 

-    .013 

-    .008 

188 
\037^ 

.037 

8 

9.978 

-    .007 

-    .002 

-.958^ 

1 

Case  2:  A  =  1.5     A .  ( j ,  V)  =  1  for  all  i  e  V. 


The  problem  considered  is 


n 

X 

2A.(j,    V)    . 

(a'.x"  +  b.) 
1               1 

SA.(j,    V)a. 

[ZA.Cj,    V)a.]' 
[ZA.(j,    V)a.] 

• 

n 

n 

n 

^2 

0 

0.000 

0.000 

-2.873 

.287, 
^-.958'' 

1 

1 

1.237 

-4.128 

-3.235 

,-.099 
^    .995-^ 

1 

2 

.75  7 

.700 

-3.326 

-.958^ 

1 

3 

2.189 

-4.079 

-3.280 

,-.099. 
^    .995-^ 

1 

4 

1.  702 

.816 

-3.166 

-.958-^ 

1 

5 

3.065 

-3.733 

-3.022 

,-.099. 
^    .995'^ 

1 

6 

2.616 

.777 

-2.866 

,    .287 
-.958-^ 

1 

7 

3.850 

-3.347 

-2.710 

,-.099, 
^    .995^ 

1 

8 

3.447 

.704 

-2.558 

,    .287 
-.958'' 

1 

P  =  {(x^):  -.099x1  +  -995x2  +  .995  ^  0, 

.287x1  -  .958x2  -  2.873  ^  0,  x^  ^  0,  x^  ^  O}, 
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2  X 


4        X 


6  X 


A  =   .75 


FIGURE  24 
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From  the  proof  of  Theorem  3.4  we  have 

d2cx"+\  p)  s  d^x^  p) ^ — "^ — i — 

(EA.(x")a.)' aA.(x")a,) 

Thus  to  get  the  best  set  of  A.'s  we  may  solve  the  following 
optimization  problem 

(ZA.(-a!x"  -  b.))^ 

1    1 1    _  , 

max ^  k 

||2A.a.||2 

X* 
Suppose  A*  solves  the  above,  then     i     also  solves  the  above. 


Hence  an  equivalent  problem  is 
max  (ZA  .  (-a*. x  -  b  . ) ) 

s.t.  IIZA.a.II^  =  1. 
In  our  specific  problem 


max  A,  +  2A„ 

2     2 
s.t.  A   +  A   =1 


or  ma 


.JTT 


A^  +  2X2- 


1 
This  gives  A^''  =r"  and  A'^'  =  f~  . 

Tliis  set  of  A.  s  gives 
1 

2     1 
X  =  („)  which  is  feasible  with  k  =  5, 

Had  we  specified  A  =  A  i.e.  ,  A.  =jy- 


which  is  not  feasible  and  k  =  4.5. 


Tlius  our  prescription  does  not  lead  to  the  best  set  of  A.s,  it  is 
however  a  good  lieuristic  and  easy  to  implement. 
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(3)  }i.  (^)  '    In  Generalized  Merzlyakov  method,  the  weights  A,(x) 

are  determined  by  the  point  alone  and  are  not  fixed  for  a  subcavity. 

The  following  appears  to  be  intuitively  a  good  set  of  weights  since 

the  ivreightage  is  dependent  on  the  magnitude  of  violation  of  a 

constraint  at  the  specific  point 

,  ,  .    , violation  of  constraint  i  at  x, 

A  .  (x)  =  ( ; :— :; : "^ ) 

J  total  violation  at  x 

Computational  Results 

The  algorithm  described  in  the  previous  section  was  coded 
primarily  to  come  up  with  a  favorable  set  of  values  for  the 
arbitrary  parameters  A  in  Phase  I  and  X   and  the  weights  X.(x)  for 
Phase  II.   It  may  be  stated  at  the  outset  that  selection  of  these 
parameters  is  an  art  rather  than  a  science  and  it  would  be  hard  to 
specify  the  best  set  of  parameters  for  the  general  case. 

The  test  problem  considered  was  the  problem  on  "Lease-Buy 
Planning  Decisions"  on  pages  93  to  114  of  Salkin  and  Saha's 
"Studies  in  Linear  Programming."   The  problem  has  17  constaints 
and  20  variables.   Together  with  dual  and  non-negativity,  set  M 
thus  had  74  constraints.   Set  C  consisted  of  the  constraint  obtained 
by  using  strong  duality.   This  problem  has  a  unique  solution.   The 
constraints  of  the  problem  are  reproduced  in  Appendix. 

Test  results  corroborate  the  general  guidelines  for  selection  of 
these  parameters  indicated  in  the  previous  section.   The  distance 
from  a  point  oC  P  where  P  =  M  n  C,  was  calculated  at  each  iteration. 
Tlie  number  of  iterations  required  to  reduce  the  distance  from  P  by 
a  certain  amount  was  used  as  the  criterion  of  improvement.   Two 


different  starting  points  were  selected.   Results  of  computations 
are  summarized  below; 
Phase  I 

A_.   The  relaxation  parameter  A  was  varied  in  the  interval  (0,2). 
For  the  first  starting  point,  the  number  of  iterations  required  to 
reduce  the  distance  d  from  P  from  d  =  74  to  d  =  60,  was  noted,  and 
for  the  second  point  the  number  iterations  required  to  reduce  the 
distance  from  d  =  135  to  d  =  132  was  noted.   These  results  are 
summarized  in  Tables  A  and  5  respectively.   The  results  are  in  line 
with  our  expectation  based  on  intuition  that  a  higher  A  has  the  effect 
of  accelerating  convergence. 
Phase  II 

llie  Phase  II  relaxation  parameter  was  varied  in  the  interval  (0,2) 
and  the  following  alternatives  were  tested  for  the  weights  A . (x) : 

(a)A.(x)  ^ 


j  //  of  violated  constraints  at  x 

,  ,  _  violation  of  constraint  j  at  x 

j  total  violation  at  x 

,  ,,  ,  ,  , violation  of  constraint  j  at  x.  2 

(c)A_.Cx)  -(    ^^^^^  violation  at  x       ^ 

.,.-,  ,  .  -violation  of  constraint  i  at  x,  ! 

('^^^j^^)  =  ^   total  violation  at  x        ^ 


4 


en 


,,-,,,    /Violation  of  constraint  .i  at  x^ 

(e)A.(x)  =  ( : — :; : ) 

2  total  violation  at  x 

,^.  -,  ^    , violation  of  constraint  j  at  x.  >   t  ^, 

(f)  If  ( ; 7 :—-, : -T )  =  •  3  th 

total  of  violation  at  x 

,  ,  ,    ,„, .violation  of  constraint  i  at  Xv 

A.(x)    =   10--^( ; r-; : ^ ) 

J  total  violation   at  x 

.violation  of   constraint   j    at   x.    ^      _      , 

if    .2    <    ( .    ,    ^  ■ :: ^ )    <    .3    then 

—  total   violation   at   x 

,     ,    ,         ^  ^    .violation   of   constraint    i    at   x. 

A.(x)    =   5*    ( ; :— ; -. :: ^ ) 

J  total   violation   at   x 


TABLE   4 
Initial   d   =    74 
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X 

Final  d 

No  of  iterations 

.25 

63.757 

>   20,000 

.5 

60.0 

12,955 

.75 

59.999 

7,339 

1.0 

60.0 

4,866 

1.25 

59.997 

3,469 

1.5 

59.997 

2,565 

1.75 

59.996 

1,927 

1.95 

59.997 

1,525 

TABLE   5 
Initial   d   =    135, 
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A 

Final  d 

No.  of  iterations 

.25 

133.113 

>  20,000 

.5 

132.0 

12,698 

.75 

132.0 

6,692 

1.0 

132.0 

4,208 

1.25 

132.0 

2,934 

1.5 

131.999 

2,137 

1.75 

131.999 

1,587 

1.95 

131.999 

1,263 
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,  <  .violation  of  constraint  i  at  x.  ^    «   , 

if  •  1  =  ( .    ,    ^  . ^ )  <  •  2  then 

total  violation  at  x 

,  ,  .    _  „,  -violation  of  constraint  j  at  x. 

A.(x)  =  3.3-'=( ——- r — ■  1  ^  ■ ) 

J  total  violation  at  x 


en 


. r    /Violation  of  constraint  i  at  x,  ^   ,   , 

if  ( : r— ; : "^ )  <  •  1  th 

total  violation  at  x 

,  ,    ,    -violation  of  constraint  i  at  Xv 

A  .  (x)  =  ( ; :— ; -: "^ ) 

J  total  violation  at  x 

,  ,  ^^  -violation  of  constraint  i  at  x^  >   „ 

(g)  If  ( :; :— ; -' )  _  .  3  then 

^  total  violation  at  x 

1  /  \    -, /^^ . /Violation  of  constraint  i  at  x, 

A.(x)  =  100«( :; :— ; : ^ ) 

J  total  violation  at  x 

.^   „  <  /Violation  of  constraint  i  at  x.  ^ 

if  •  2  =  ( r~r~i •  1  ^- T^^ )  <  -3  then 

total  violation  at  x 

,  /  ,    ,_, /Violation  of  constraint  ]    at  x> 

A.(x)  =  10"( :; : — :; : ■^— ) 

J  total  violation  at  x 

,  -  /Violation  of  constraint  i  at  x,  ,  „ 

if    •    1  <    ( : r-; : ^ )  <  .  2    then 

=  total   violation   at   x 

,     /    V         „    .,,  /Violation   of   constraint    i    at   x, 

A  .  (x)  =  3.  3"  ( : — ; : -' ) 

J  total  violation  at  x 


en 


.,  /Violation  of  constraint  i  at  x.  ^ 

if  ( — ' :—: : ^ )  <  .1  th 

total  violation  at  x 

-,  /  >    /Violation  of  constraint  i  at  x, 

A.(x)  =  ( ; : — :, : -^ — ) 

J  total  violation  at  x 

Results  of  alternative  (a)  are  given  in  Tables  6  and  7  for 
the  same  two  points  used  in  Phase  I. 

From  Tables  6  and  7  it  would  appear  that  again  A  of  around  2 
gives  best  convergence,  but  the  counterexample  in  the  previous 
section  shows  ihat  this  is  not  alv^/.•lys  the  best  policy  with  this  set 
of  weights. 

Results  of  alternative  (b)  are  given  in  Table  8. 


TABLE   6 

(a)    ^    (x)    = . 

j  //   of  violated    constraints   at   x' 


Initial   d  =    74 
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A 

Final  d 

No.  of  iterations 

.25 

— 

time  limit  exceeded 

.5 

— 

time  limit  exceeded 

.75 

60.0 

14,839 

1.0 

60.0 

8,028 

1.25 

59.998 

4,702 

1.5 

59.998 

2,861 

1.75 

60.0 

1,424 

1.95 

59.857 

850 

TABLE  7 

(a)  A.(x)  =  .,      ^      .    ^ -^ — 

J       //  of  violated  constraints  at  x" 


Initial  d  =  135 


92 


25 


75 


1.0 


Final  d 


133.45 


132.0 


131.999 


No.  of  iterations 


time  limit  exceeded 


>20,000 


15,249 


7,568 


1.25 


131.999 


4,135 


1.5 


131.999 


2,435 


1.75 


131.998 


1,212 


1.95 


132.0 


713 
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(b)  X   .(x) 
J 


TABLE  8 
violation  of  constraint  j  at  x 
total  violation  at  x 


Intiial  d  =  74 


X 

Final  d 

No.  of  iterations 

.25 

59.999 

1,253 

.5 

59.964 

1,396 

.75 

59.997 

1,505 

1.0 

60.0 

1,084 

1.25 

59.999 

1,254 

1.5 

59.976 

1,061 

1.75 

59.997 

863 

1.95 

59.998 

894 
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Alternative  (b)  appears  comparatively  insensitive  to  the 

relaxation  parameter  X.      Results  of  alternatives  (b),  (c) ,  (d) 

and  (e)  are  tabulated  below: 

TABLE  9 
,,  .   -,  /  N    .violation  of  constraint  i  at  x. 

Alternative  (b)  :  A.(x)  =  ( T~r~i ^~'^ — r^~ — ~~Z ) 

J  total  violation  at  x 

.  .   -,  /  V    , violation  of  constraint  j  at  x  2 

Alternative  (c)  :  A.(x)  =  (    ^    ^    . ^~, — T- Z ) 

J  total  violation  at  x 

.  ,   .    ^        .    .      3 

,,.   ,  /  X    .violation  of  constraint  i  at  x. 

Alternative  (d)  :  A.(x)  =  ( -— .  .  ^  .- — ) 

J  total  violation  at  x 

,  .   -,  /  N    .Violation  or  constraint  i  at  x. 

Alternative  (e)  :  A  .  (x)  =  ( ^  ^  , .  ,  ^  . ;;: ^ ) 

J  total  violation  at  x 

Initial  d  =  74      A  =  .  25 


Iteration 


no.      Alternative   Alternative  Alternative   Alternative 


(b) 

(c) 

(d) 

(e) 

1 

73.881 

73.883 

73.885 

73.886 

2 

73.868 

73.874 

73.878 

73.880 

3 

73.826 

73.836 

73.859 

73.870 

4 

73.812 

73.821 

73.851 

73.860 

5 

73.801 

73.754 

73.824 

73.853 

6 

73.792 

73.747 

73.819 

73.841 

7 

73.786 

73.737 

73.813 

73.836 

8 

73.749 

73.722 

73.807 

73.817 

9 

73.741 

73.718 

73.80  3 

73.813 

10 

73.737 

73.681 

73.673 

73.809 
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Prima-facie  alternative  (c)  and  (d)  looked  comparable  to 
alternative  (b).   So  the  number  of  iterations  required  to  reduce 
d  to  60.0  was  determined  and  is  given  in  Table  10. 

From  Table  10  there  appears  no  significant  difference 
between  alternative  (b),  (c),  and  (d) . 

Results  of  alternatives  (b),  (f),  and  (g)  are  shown  in 
Table  11.   Since  no  significant  difference  in  performance  was 
observed,  (f)  and  (g)  were  not  pursued  any  further. 

Results  of  above  study  show  that  the  following  constitute  a 
favorable  set  of  parameters: 

Phase  I.    X  =  2 

Phase  II.   A  =  2 

■\    f    \    -   violation  of  constraint  j  at  x 
j  total  violation  at  x 

However  as  indicated  earlier,  there  are  simple  counterexamples 
showing  alternative  values  of  the  parameters  which  give  better 
convergence.    Hence  the  above  prescriptions  can  at  best  be 
considered  good  heuristics. 

For  Phase  I  of  above  study,  we  moved  in  the  direction  of 
most  violated  constraint  of  M.   Investigations  were  also  conducted 
combining  Phase  I  and  II.   Instead  of  moving  in  the  direction  of 
the  most  violated  constraint  of  M, we  used  Generalized  Merzlyakov 
method  for  relaxation  with  respect  to  M.   The  results  were  very 
heartening  and  are  shown  in  Table  12.   In  this  table  a  comp;irison 
has  been  made  with  alternative  (b)  of  Phase  II. 
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TABLE  10 
Initial  d  =  74 


A 

Final  d 

//  of  iterations 

Alternative  (b) 

1.95 

59.998 

894 

Alternative  (c) 

1.95 

59.989 

956 

Alternative  (d) 

1.95 

59.982 

1,048 

TABLE  11 
Initial  d  =  74 


A  =  .25 


d 

Iteration 
number 

Alternative 
(b) 

Alternative 
(f) 

Alternative 

(k) 

1 

73.881 

73.883 

73.883 

2 

73.868 

73.875 

73.875 

3 

73.826 

73.861 

73.861 

4 

73.812 

73.844 

73.844 

5 

73.801 

73.836 

73.836 

6 

73.792 

73.831 

73.831 

7 

73.786 

73.809 

73.806 

8 

73.749 

73.803 

73.800 

9 

73.741 

73.797 

73.794 

10 

73.737 

73.794 

73.791 

TABLE  12 
Initial  d  =  74      Final  d  =  60 
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A 

Jt 

of 

iterations 

Phase  I  & 
combinec 

II 

Alternative  (b) 
of  Phase  II 

.1 

1882 

1686 

.25 

740 

1253 

.3 

656 

1590 

.4 

483 

1241 

.45 

381 

1427 

.46 

352 

1354 

.47 

322 

1273 
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Concluding  Remarks 
The  simplex  method  of  solving  linear  programs  has  achieved  its 
present  popularity  in  a  large  measure  due  to  the  advances  during 
the  last  three  decades  on  extending  the  effectiveness  of  the  basic 
algorithm  introduced  by  Dantzig  in  1951.   There  are  however  some 
useful  large-scale  linear  programs  beyond  the  capability  of  the 
simplex  method  which  are  not  being  formulated  and  solved  at  present 
due  perhaps  to  their  size.   It  is  for  such  problems  that  iterative 
techniques  seem  to  be  a  viable  alternative.   We  have  surveyed  two 
such  techniques  —  relaxation  methods  and  subgradient  methods  —  in 
considerable  detail.   It  has  also  been  shown  that  most  of  the  recent 
subgradient  methods  are  mathematically  subsumed  by  the  Generalized 
Merzlyakov  method.   The  latter  along  with  relaxation  methods  in 
general  are  computationally  far  simpler.   In  this  dissertation 
another  algorithm  using  relaxation  method  has  been  presented  which 
possesses  a  linear  rate  of  convergence.   The  new  algorithm  has  been 
coded  and  an  attempt  made  to  find  for  it  a  favorable  set  of 
parameters.   A  variant  of  the  algorithm  has  been  seen  experimentally 
to  have  considerable  merit  over  the  present  relaxation  methods. 
Convergence  properties  of  this  algorithm  are  still  not  attractive 
enough  to  be  of  practical  value.   However,  it  is  hoped  that  these 
could  be  improved  upon  and  an  algorithm  found  that  can  economically 
solve  large  scale  unstructured  linear  programs. 


APPENDIX 

The   Test    Problem 

min.    1516x,    +  1238x^   +  963x^   +  750x,    +  450x^   +  1200x^   +  lA30x^ 
1  2  J  4  5  6  7 

+  1300x     +  1260JC     +  1300x        +  870:x       +  500x        +  1400x 

+  1360x,,    +  2160X,,.   +  2325x,,    +   1095x,^   +  2720x,- 
14  Ij  lb  i  /  io 

+  1440x^g   +  SOOx^Q 
subject    to 

^6   +  ^7  +  "18  •*■  ^9   =   ^"^ 

^4'""i5  -""le  ^  ^7  ^^s'-^ig  =  ^"^ 

"l  -^  "6   -^  "7  ^  ^4   -^  ^5   ^  ^6   "■  \8  ■"  \9   ^   '' 

"l  +  "2  "■  "6  ^  "7  "^  "8  "■  "9  '■  "13  +  "14  "^  "l5  "^  "l6  ■*■  "l8  =  ^° 
x^  +  X2  +  X3  +  x^  +  x^  +  Xg  +  x^g  +  ^  13  +  ^14  +  ^15  +  ^15  +  ^ig  =  ^^ 
x^  +  x^  +  X3  +  x^  +  x^  +  xg  +  Xg  +  x^Q  +  x^^  +  x^3  +  x^3  +  x^^  +  x^g  >  65 
x^  +  x^  +  X3  +  x^  +  X3  +  x^  +  Xg  +  x^Q  +  x^^  +  x^2  ■"   "13  ^  "15 

■*-  "18  ^  "20  =^  ^° 

"16  ■*■  "17  =  ^2 

"18  ^  "19  =  ^^ 

"13  ^  1° 

"14  +  "15  ^  ^5 

-"16  -^  "20  =<  ° 

"1  +  "2  ^  "6  "■  "7  ■"  "8  ^  "9  -  "l4  -  "15  ^  "17  ■"  "19  ^  3^ 

"3  ^  "4  ^  "10  +  "11  -  "13  =  ^° 

"5  +  "6  ^  "8  ^  "12  +  "14  ""  "16  -  "20  =  ^^ 

"10  -^  "11  ■*■  "13  •*■  "18  =<  ^° 


"7  -^  "9  ^  "12  ^  "17  ^  "20  ^  ^° 


X.  >0,  1=1,  ...,20 
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Note :   Problem  abstracted  from  Case  Study  on  "Lease-Buy  Planning 
Decisions"  in  Salkin,  H.  M. ,  and  J,  Saha  Studies  in  Linear 
Programming,  19  75,  pp.  93-114. 
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